Preface


When I was an undergraduate student in Scotland in the late eighties working on
seismic anisotropy, I was lucky enough to attend an international meeting
that took place at the University of California, Berkeley. I saw a presentation
of a Stanford PhD student showing 2D simulations of anisotropic wave propa- 
gation using the finite-difference method. I was totally struck by the beauty of 
the graphics, the ease with which one could develop intuition about wave phe- 
nomena, and the elegant and simple maths underlying the simulation method. 
This became what I wanted to learn, master, and apply! Luckily, later I was of- 
fered a PhD position in Paris, where the development of finite-difference-based 
simulation methods and their application to inverse problems became my topic. 

It is important to note that at the time (parallel) computer codes needed for 
our research were basically written from scratch. The primary goal was to en- 
sure that the codes were correct (rather than making them readable by others 
through heavy commenting). As computers grew larger and architectures became 
more complex, this heroic (unprofessional) coding style no longer worked. Today, 
parallel codes must have a different quality. To be able to obtain computational 
resources on large supercomputer facilities, one has to demonstrate proper par- 
allel scaling properties and in many cases this is not possible without interaction 
with computational scientists. 

This implies a paradigm shift in the approach to simulation technology. Today, 
the shortest time to results for students involves the use of community software 
provided by projects like CIG (<http://www.geodynamics.org>), individual re- 
searchers or groups (see appendix), or community platforms as developed within 
the VERCE project (<http://www.verce.eu>). This creates problems. Numerical 
methods are not necessarily featured with much detail in Earth science courses. 
On the other hand, it is usually straightforward to use community simulation tools 
and obtain synthetic seismograms. However, without experience, quality control 
is difficult. I am quite often presented with simulation results that have
obvious problems with the set-up.

How can we fix this? Students and researchers should have at least a basic un- 
derstanding of what is under the bonnet of current simulation technologies used 
to solve interesting research problems. They should understand what problems 
to look out for, and how to properly design simulation tasks, and to ensure the 
results are correct. In addition, today there is a zoo of different methods, and it is 
difficult to choose the right method for a particular problem. In this volume I try 
to provide some guidelines. 




The strategy is to keep the maths as simple as possible, extensively using 
graphics to illustrate concepts, while at the same time presenting the link between 
theory and computer program (using the Python language and Jupyter note- 
books). This concept should be beneficial to both students and lecturers. While 
it is advisable to write your own codes as much as possible (and compare them with the
solutions presented here), lecturers can start right away using the supplementary
electronic material and the online platform provided. 

This volume should be considered a starting point. There are many excellent 
books for each of the numerical methods presented in this text. These references 
should be consulted when more detail is required. The focus here is to present 
the fundamental concepts of the various methods, their inter-relations, and pros 
and cons for specific applications in seismology (and other fields). 

My hope is that you become equally excited about this fascinating field of 
Earth science, and use your knowledge to further our understanding of this 
amazing planet! 


Heiner Igel 
Munich, July 2016 


Acknowledgements 


Thanks to Sonke Adlung from Oxford University Press for suggesting this 
project, to Ania Wronski and the production staff at Integra (Marie Felina 
Francois) for their help and support, and to Henry MacKeith for copy-editing. 

I would like to express my gratitude to those who helped me to get off the 
ground in science. Stuart Crampin of the University of Edinburgh, taught me to 
‘think science’, and became a lifelong friend. This volume would never have been 
possible without the vision of Albert Tarantola and Peter Mora of the Institut 
de Physique du Globe Paris, who—decades ago—foresaw the impact of parallel 
computing in the Earth (and other) sciences. I consider myself very lucky to have
had these scientists as supervisors.

The work presented in this volume benefitted from research projects funded 
by the German Research Foundation, the European Union, the European Re- 
search Council, the Bavarian Government, the German Ministry of Research, the 
Volkswagen Foundation, and the European Science Foundation. I gratefully ac- 
knowledge the strong support from the Leibniz Supercomputing Centre Munich. 

The concepts for this volume were born out of workshops organized dur- 
ing the SPICE and QUEST training networks funded by the European Union 
between 2003 and 2013. The people who invented the training network funding
instruments deserve an award! We learned so much through these projects
on both seismic forward and inverse modelling, which pushed the limits in 
seismology substantially, and at the same time shaped careers for dozens of young 
scientists who today are scattered around the world, many in senior positions. 


Infinite thanks to the principal investigators and associates of the SPICE 
and QUEST projects—they were so much fun: Chris Bean, Lapo Boschi, Jo- 
hana Brokesova, Michel Campillo, Torsten Dahm, Ana Ferreira, Domenico 
Giardini, Alex Goertz, Matthias Holschneider, Raul Madariaga, Martin Mai, 
Valerie Maupin, Peter Moczo, Jean-Paul Montagner, Andrea Morelli, Tarje 
Nissen-Meyer, Guust Nolet, Johan Robertsson, Barbara Romanowicz, Malcolm 
Sambridge, Geza Seriani, Karin Sigloch, Eleonore Stutzmann, Jeannot Trampert, 
Colin Thomson, Jeroen Tromp, Jean-Pierre Vilotte, Jean Virieux, John Wood- 
house, Aldo Zollo, and many others; the enthusiastic administrators Erika Vye 
and Greta Küppers; and all doctoral and postdoctoral researchers involved.

Part of the material benefitted enormously from other people’s work; for exam- 
ple, Bernhard Schuberth’s diploma thesis, course material by Martin Kaser, and 
Andreas Fichtner’s book on modelling and inversion. Thanks to Peter Shearer 
and Cambridge University Press for giving permission to use some of the graph- 
ics from Peter’s excellent introductory work in seismology. I also want to thank 
Wolfgang Bangerth who introduced me to the finite-element method many years 
ago using just the blackboard and a pen. 

This volume got started during the phenomenal RHUM-RUM cruise in the 
Indian Ocean in autumn 2013, coordinated by Karin Sigloch and Guilhem Bar- 
ruol. Thanks to Yann Capdeville with whom I shared many day and night shifts. 
He helped me a lot with the spectral-element method. According to him I was his 
worst student ever. 

Thanks to Florian Wölfl, who helped tremendously getting the LaTeX project
started. Sebastian Anger, Bryant Chow, Jonas Igel, Lion Krischer, David Vargas, 
and Moritz Goll helped with the graphics, slide material, and the notebooks. 
Special thanks to Stephanie Wollherr: her mathematical skills and good humour 
helped substantially in getting the finite-volume and discontinuous Galerkin chap- 
ters (and the codes) in shape. Sujana Talavera provided great comments on these 
parts, too. I am also grateful to Matthias and Thomas Meschede for creating the 
graphics for the title page. 

Thanks to all the participants of the Munich Earth Skience School 2015 
(<http://www.geophysik.lmu.de/MESS>) in Sudelfeld, Christine Thomas for collecting
all the comments, and the staff Renate and Winfried Löffler and Michael
Sponi for always creating a wonderful atmosphere over the years. Thanks to the
participants of the 2015 seminar on computational seismology and all their com- 
ments and suggestions on the draft: Michael Bader, Esteban Bedoya, Christoph 
Heidelmann, Eduard Kharitonov, Jiunn Lin, Martin Mai, Sneha Singh, Taufiq 
Taufiqurrahman, Tushar Upadhyay, Vasco Varduhn, and Donata Wardani.

Thanks to Toshiro Tanimoto, Alain Cochard, and Nicolas Brantut who went 
through the manuscript and provided many useful comments and corrections. 
Thanks to Josef Kristek and Peter Moczo for reviewing the chapter on the finite- 
difference method, and making many excellent suggestions. 

Special thanks to Lion Krischer and Tobias Megies for pushing me to- 
wards Python and the Jupyter Notebooks, opening new ways for the training of 
numerical methods in seismology, and bringing <http://seismo-live.org> to life.




Many have helped with the project, in various forms (comments, graph- 
ics, ideas, codes): Robert Barsch, Moritz Bernauer, Jacobo Bielak, Alex Breuer, 
Emanuelle Casarotti, Josep de la Puente, Michael Dumbser, Kenneth Duru, 
Michael Ewald, Ana Ferreira, Bob Geller, Sarah Hable, Celine Hadziioannou, 
Kasra Hosseini, Alice Gabriel, Verena Herrmann, Gunnar Jahnke, Brian Kennett, 
Lane Johnson, Fabian Lindner, Dave May, Christian Pelties, Michael Reinwald, 
Johannes Salvermoser, Stefanie Schwarz, Robert Seidl, Geza Seriani, my wife 
Maria Stange, Simon Stähler, Marco Stupazzini, Ulrich Thomas, Jeroen Tromp,
Maria Tsekhmistrenko, Martin van Driel, Jean-Pierre Vilotte, Haijiang Wang, 
Joachim Wassermann, Moritz Wehler, Stefan Wenk, and Djamel Ziane. Thanks 
to the Geophysics IT and HPC team, Jens Oser, Gerald Schroll, and Marcus 
Mohr for their support. 

Thanks to Tamiko Thiel for sharing stories with me about the early days of 
Thinking Machines and the design of the parallel Connection Machine. 

I greatly appreciate the help of my Edinburgh flatmate and friend Sean 
Matthews, who proofread the entire manuscript for language problems. 

Thanks to Elisabeth and Ernst Ullmann for providing me with the right Col- 
nago frames for my cycling workout, and for their friendship. Finally, I would like 
to thank my family: my father Hans, Inge, my sister Bärbel, Jürgen, Moritz, Felix,
and my parents-in-law Ortrud and Karl, for their support and love. In loving
memory of my mother. 

Thanks also to all those whom I have forgotten to acknowledge. 


Contents 


1 About Computational Seismology 


1.1 What is computational seismology? 
1.2 What is computational seismology good for?
1.3 Target audience and level
1.4 How to read this volume
1.5 Code snippets
Further reading


Part I Elastic Waves in the Earth


2 Seismic Waves and Sources 


2.1 Elastic wave equations
2.2 Analytical solutions: scalar wave equation
2.3 Rheologies
2.3.1 Viscoelasticity and attenuation
2.3.2 Seismic anisotropy
2.3.3 Poroelasticity
2.4 Boundary and initial conditions
2.4.1 Initial conditions
2.4.2 Free surface and Lamb’s problem
2.4.3 Internal boundaries
2.4.4 Absorbing boundary conditions
2.5 Fundamental solutions
2.5.1 Body waves
2.5.2 Gradient, divergence, curl
2.5.3 Surface waves
2.6 Seismic sources
2.6.1 Forces and moments
2.6.2 Seismic wavefield of a double-couple point source
2.6.3 Superposition principle, finite sources
2.6.4 Reciprocity, time reversal
2.7 Scattering
2.8 Seismic wave problems as linear systems
2.9 Some final thoughts
Chapter summary
Further reading
Exercises


3 Waves in a Discrete World 


3.1 Classification of partial differential equations
3.2 Strategies for computational wave propagation
3.3 Physical domains and computational meshes
3.3.1 Dimensionality: 1D, 2D, 2.5D, 3D
3.3.2 Computational meshes
3.3.3 Structured (regular) grids
3.3.4 Unstructured (irregular) grids
3.3.5 Other meshing concepts
3.4 The curse of mesh generation
3.5 Parallel computing
3.5.1 Physics and parallelism
3.5.2 Domain decomposition, partitioning
3.5.3 Hardware and software for parallel algorithms
3.5.4 Basic hardware architectures
3.5.5 Parallel programming
3.5.6 Parallel I/O, data formats, provenance
3.6 The impact of parallel computing on Earth Sciences
Chapter summary
Further reading
Exercises


Part II Numerical Methods


4 The Finite-Difference Method 


4.1 History
4.2 The finite-difference method in a nutshell
4.3 Finite differences and Taylor series
4.3.1 Higher derivatives
4.3.2 High-order operators
4.4 Acoustic wave propagation in 1D
4.4.1 Stability
4.4.2 Numerical dispersion
4.4.3 Convergence
4.5 Acoustic wave propagation in 2D
4.5.1 Numerical anisotropy
4.5.2 Choosing the right simulation parameters
4.6 Elastic wave propagation in 1D
4.6.1 Displacement formulation
4.6.2 Velocity-stress formulation
4.6.3 Velocity-stress algorithm: example
4.6.4 Velocity-stress: dispersion
4.7 Elastic wave propagation in 2D
4.7.1 Grid staggering
4.7.2 Free-surface boundary condition
4.8 The road to 3D
4.8.1 High-order extrapolation schemes
4.8.2 Heterogeneous Earth models
4.8.3 Optimizing operators
4.8.4 Minimal, triangular, unstructured grids
4.8.5 Other coordinate systems
4.8.6 Concluding remarks
Chapter summary
Further reading
Exercises


5 The Pseudospectral Method 


5.1 History
5.2 The pseudospectral method in a nutshell
5.3 Ingredients
5.3.1 Orthogonal functions, interpolation, derivative
5.3.2 Fourier series and transforms
5.4 The Fourier pseudospectral method
5.4.1 Acoustic waves in 1D
5.4.2 Stability, convergence, dispersion
5.4.3 Acoustic waves in 2D
5.4.4 Numerical anisotropy
5.4.5 Elastic waves in 1D
5.5 Infinite order finite differences
5.6 The Chebyshev pseudospectral method
5.6.1 Chebyshev polynomials
5.6.2 Chebyshev derivatives, differentiation matrices
5.6.3 Elastic waves in 1D
5.7 The road to 3D
Chapter summary 
Further reading 
Exercises 


6 The Finite-Element Method 


6.1 History
6.2 Finite elements in a nutshell
6.3 Static elasticity
6.3.1 Boundary conditions
6.3.2 Reference element, mapping, stiffness matrix
6.3.3 Simulation example
6.4 1D elastic wave equation
6.4.1 The system matrices
6.4.2 Simulation example
6.5 Shape functions in 1D
6.6 Shape functions in 2D
6.7 The road to 3D
Chapter summary
Further reading
Exercises


7 The Spectral-Element Method 


7.1 History
7.2 Spectral elements in a nutshell
7.3 Weak form of the elastic equation
7.4 Getting down to the element level
7.4.1 Interpolation with Lagrange polynomials
7.4.2 Numerical integration
7.4.3 Derivatives of Lagrange polynomials
7.5 Global assembly and solution
7.6 Source input
7.7 The spectral-element method in action
7.7.1 Homogeneous example
7.7.2 Heterogeneous example
7.8 The road to 3D
Chapter summary
Further reading
Exercises


8 The Finite-Volume Method 


8.1 History
8.2 Finite volumes in a nutshell
8.3 The finite-volume method via conservation laws
8.4 Scalar advection in 1D
8.5 Elastic waves in 1D
8.5.1 Homogeneous case
8.5.2 Heterogeneous case
8.5.3 The Riemann problem: heterogeneous case
8.6 Derivation via Gauss’s theorem
8.7 The road to 3D
Chapter summary
Further reading
Exercises


9 The Discontinuous Galerkin Method 


9.1 History 
9.2 The discontinuous Galerkin method in a nutshell 
9.3 Scalar advection equation 
9.3.1 Weak formulation 
9.3.2 Elemental mass and stiffness matrices 
9.3.3 The flux scheme 
9.3.4 Scalar advection in action 
9.4 Elastic waves in 1D 
9.4.1 Fluxes in the elastic case
9.4.2 Simulation examples 
9.5 The road to 3D 
Chapter summary 
Further reading 
Exercises 


Part III Applications


10 Applications in Earth Sciences


10.1 Geophysical exploration
10.2 Regional wave propagation
10.3 Global and planetary seismology
10.4 Strong ground motion and dynamic rupture
10.5 Seismic tomography – waveform inversion
10.6 Volcanology
10.7 Simulation of ambient noise
10.8 Elastic waves in random media
Chapter summary
Exercises


11 Current Challenges in Computational Seismology


11.1 Community solutions
11.2 Structured vs. unstructured: homogenization
11.3 Meshing
11.4 Nonlinear inversion, uncertainties


Appendix A Community Software and Platforms in Seismology 


A.1 Wave propagation and inversion 

A.2 Data processing, visualization, services 
A.3 Benchmarking 

A.4 Jupyter Notebooks 

A.5 Supplementary material 


References 


Index 




1 About Computational Seismology


1.1 What is computational seismology? 


For seismologists the calculation of synthetic (or theoretical) seismograms is a key 
activity on the path to a better understanding of the structure of the Earth’s 
interior, or the sources of seismic energy. There are many ways of doing this, 
depending in particular on the assumptions made in the geophysical model. 
In the most general case—an Earth in which the properties vary in three 
dimensions—analytical solutions do not exist. 

The complete solution of the governing 3D partial differential equations de- 
scriptive of elastic wave propagation requires the adaptation of numerical methods 
developed in the field of applied mathematics. For the purpose of this volume I 
define computational seismology such that it involves the complete numerical solu- 
tion of the seismic wave-propagation problem for arbitrary 3D models (a pictorial 
example is shown in Fig. 1.1). A further restriction is that we focus on so-called 
time-domain solutions rather than frequency-domain approaches. In time-domain 
approaches the space-dependent seismic wavefield is extrapolated time step after 
time step into the future. In frequency-domain approaches the wave equations 
are transformed into the spectral domain and solved for each frequency. Seismo- 
grams can then be obtained by inversely transforming the spectra into the time 
domain. Another numerical approach that is not discussed here is the boundary- 
integral-equation method that is being used in fracture mechanics. Time domain 
solutions are the most commonly used computational tools today to calculate 
seismograms and to solve seismic inverse problems in 3D. 
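
To make the time-domain idea concrete, the following minimal sketch extrapolates a 1D scalar wavefield time step after time step using a simple finite-difference update. It is only an illustration under stated assumptions: the function name, grid size, velocity, time step, and the Gaussian initial condition are all hypothetical choices made here, not values taken from this volume. The finite-difference method itself is the subject of Chapter 4.

```python
import math


def simulate_1d_wave(nx=201, nt=100, dx=1.0, c=1.0, dt=0.5):
    """Extrapolate a 1D scalar wavefield p(x, t) time step after time step."""
    # Assumed initial condition: Gaussian pulse in the middle of the domain.
    x0 = 0.5 * (nx - 1) * dx
    p = [math.exp(-(((i * dx) - x0) / (5.0 * dx)) ** 2) for i in range(nx)]
    # Using the same field for the previous step approximates zero initial velocity.
    p_old = p[:]
    r2 = (c * dt / dx) ** 2  # squared Courant number; must not exceed 1 for stability
    for _ in range(nt):
        p_new = [0.0] * nx  # end points stay at zero (rigid boundaries)
        for i in range(1, nx - 1):
            # Second-order centred differences in space and time.
            p_new[i] = 2.0 * p[i] - p_old[i] + r2 * (p[i + 1] - 2.0 * p[i] + p[i - 1])
        p_old, p = p, p_new
    return p


wavefield = simulate_1d_wave()
print("maximum amplitude after 100 time steps:", max(wavefield))
```

With these assumed parameters the initial pulse splits into two pulses of roughly half the amplitude that travel in opposite directions, which is the behaviour expected from the 1D wave equation.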

The definition of the field of computational seismology would be incomplete 
without making a reference to the mind-boggling evolution of computational 
hardware. Your smartphone today is almost as powerful as the supercomputers 
that were used when I started my PhD in 1990 at the Institut de Physique du 
Globe in Paris. At that time we were lucky to be able to lay our hands on one of 
the first (then called massively) parallel computers, built by the company Thinking 
Machines Corp. 

The calculation of seismograms for any even vaguely realistic problem in 3D 
is computationally expensive. This implies that the actual implementation of a numerical
algorithm that solves the seismic wave-propagation problem is involved,
requires so-called parallelization, and is potentially strongly hardware-dependent. 


Computational Seismology. First Edition. Heiner Igel. 
© Heiner Igel 2017. Published in 2017 by Oxford University Press. 




Fig. 1.1 Snapshot of global seismic wave propagation through a 3D mantle
convection model. Inside the Earth, iso-velocity surfaces indicate flow patterns.
The seismic wavefield, simulated with the spectral-element method, is
indicated with green and pink colours. Figure courtesy of B. Schuberth.


¹ In his later life, legendary physicist and Nobel Prize winner Richard Feynman
designed the communication scheme for this machine (the Connection Machine
built by Thinking Machines Corp.).
Fig. 1.2 Source-receiver raypaths for an earthquake-receiver configuration
involving part of the US-Array portable network (stations shown as small grey
triangles). The figure illustrates a projection of the ray density to the Earth’s
surface, giving a qualitative indication of how well structure could be recovered
with tomographic inversion. Figure courtesy of L. Krischer.


² You might be able to write down quasi-analytical solutions with pen and paper,
but eventually, when you do the computations, some approximations are required
(e.g. when calculating series). Hence the term quasi.


It is beyond the scope of this volume to deal with implementation issues. How- 
ever, the necessity to parallelize numerical solutions has had an important impact 
on the design, evolution, and survival of specific approaches. 

There are several other—let us call them classic—ways to calculate synthetic 
seismograms, and there are excellent textbooks on many of them (see end of 
this chapter). Each of these methods is making specific assumptions that—in 
general—are not necessary when applying numerical methods to the wave- 
propagation problem. For example, almost everything we know today about the 
Earth’s interior, much of the dynamics of the mantle, and the use of our planet’s
(once) gigantic energy resources (e.g. hydrocarbons) is based on the observations 
of arrival (travel) times of seismic phases that are analysed with ray theory. 

Ray theory is based on the assumption that high (even infinite) frequencies are 
travelling through the Earth and along their way only see long-wavelength struc- 
tures (for an illustration see Fig. 1.2). The advantage is that these calculations are 
very fast and thus allow an efficient solution of the inverse problem (ray tomog- 
raphy). In recent years, the inclusion of finite-frequency effects into the calculation 
of arrival times and ray-based synthetic seismograms has led to an exciting novel 
approach to the waveform inversion problem at the high-frequency end of the 
seismic spectrum. 
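
One reason why ray-based calculations are so fast is that a travel time is just the slowness integrated along the ray path. As a hedged illustration, the sketch below evaluates this for the simplest possible case, a vertically travelling ray through a stack of flat, homogeneous layers. The function name and the three-layer model are hypothetical and chosen only for this example.

```python
def vertical_travel_time(thicknesses_km, velocities_km_s):
    """One-way travel time (s) of a vertically travelling ray through flat layers."""
    if len(thicknesses_km) != len(velocities_km_s):
        raise ValueError("need exactly one velocity per layer")
    # Travel time = sum over layers of (path length) * (slowness) = h / v.
    return sum(h / v for h, v in zip(thicknesses_km, velocities_km_s))


# Hypothetical three-layer model (values chosen only for illustration).
thicknesses = [10.0, 20.0, 30.0]  # km
velocities = [5.0, 6.0, 8.0]      # km/s
t = vertical_travel_time(thicknesses, velocities)
print("vertical travel time in seconds:", t)
```

The cost is a handful of divisions per ray, compared with the many thousands of grid-point updates per time step needed by a full wavefield simulation.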

Other so-called modal solutions are quasi-analytical² solutions to the governing
partial differential equations mostly based on the assumption of one-dimensional 
variation of the parameters (e.g. seismic velocities only vary with depth and are 
laterally homogeneous). On a global scale the normal-mode solutions based on 
spherical harmonics fall into this category. In Cartesian coordinates the reflectivity 
method for layered Earth models is another example. The two aforementioned 
methodologies are still workhorses for many applications. 

At this point I would like to make a bold statement: Even though this is a 
volume on computational seismology (in the way I have defined it), it is important 
to stress that the classic methods just described will continue to play an extremely 
important role. To understand certain parts of the seismogram it is sometimes 
advantageous (and computationally cheaper) to use classic methods. Also, when 
verifying whether your computer program does the right thing the only way to 
find out is to compare it with well-established (quasi-) analytical solutions, where 
this is possible. 

In the light of this, and this is part of the rationale for this volume, use 3D
simulation tools with care, and make sure you also gain experience with other classic 
techniques. This will help you to develop an in-depth understanding of seismic 
wave propagation, and to efficiently solve your research problem. 


1.2 What is computational seismology 
good for? 


So, you want to calculate a seismogram? Whether you are an exploration 
geophysicist, global seismologist, volcanologist, rock physicist, or a seismologist 
working in an insurance company trying to quantify hazard, you are faced with
the question of which numerical approach to choose for your particular problem. 
There is no simple answer to this question, and that is a key reason for writing 
this volume. It is not likely that one approach will develop into the one-and-only 
solution for all wave-propagation problems. Therefore it is important to under- 
stand some of the properties of the various schemes in order to make the right 
choice. 

In the following I want to briefly highlight the role of 3D simulations in the var- 
ious application domains and will point to some of the issues involved. A detailed 
discussion of applications can be found in Part III of this volume. 

In the absence of any serious hope of predicting earthquakes in the near future, 
the calculation of the ground shaking for potential earthquake scenarios is one of 
the most important strategies in ground-shaking hazard studies. In order for these 
calculations to be useful for earthquake engineers, they have to reach frequencies 
that are relevant for building responses. This is hard to do given the uncertainties 
of the near-surface small-scale structure, which, however, might have important 
effects on the shaking characteristics. Already, this implies that we have strong ma- 
terial variations in the model, indicating that we might have to use a methodology 
that allows varying the grid density across the model. 
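
The link between material variations and grid density can be illustrated with a small calculation. The sketch below assumes the common rule of thumb that the shortest wavelength in the model must be sampled with a fixed number of grid points; the value of 10 points per wavelength, the function name, and the velocities and frequency are illustrative assumptions, and the appropriate number depends on the numerical method.

```python
def grid_spacing(v_min, f_max, points_per_wavelength=10):
    """Largest grid spacing (m) that still samples the shortest wavelength."""
    shortest_wavelength = v_min / f_max  # wavelength = velocity / frequency
    return shortest_wavelength / points_per_wavelength


# Hypothetical values: soft sediments and bedrock, frequencies up to 5 Hz.
dx_sediment = grid_spacing(v_min=300.0, f_max=5.0)   # 6 m
dx_bedrock = grid_spacing(v_min=3000.0, f_max=5.0)   # 60 m
print("grid spacing in sediments (m):", dx_sediment)
print("grid spacing in bedrock (m):", dx_bedrock)
```

Under these assumptions a uniform grid fine enough for the sediments would oversample the bedrock by a factor of ten in each direction, which is why methods that allow a spatially varying grid density are attractive for such models.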

Equally relevant for hazard studies is the field of earthquake physics in which 
we seek an understanding of the processes governing seismic rupture. While 
in general seismology is a data-rich science, this is less the case for earthquake 
source studies. Even though we now have some very well-recorded large seismic 
events (like the M9.0 Tohoku-Oki earthquake in Japan in 2011), we are lacking 
observations close to the source that would allow us to put tighter constraints 
on the rupture properties. In this field it is important to be able to properly 
implement the specific frictional boundary conditions that act on a fault. Find- 
ing accurate and efficient solutions to this problem is still a very active research 
field. 

Active source seismology in exploration geophysics usually aims at generating 
body waves and avoiding surface waves if possible. The focus on body waves has 
some consequences on the choice of an appropriate solver. For some methods the 
implementation of accurate free-surface boundary conditions is more difficult. 
In reservoir situations it might also be necessary to involve more complicated 
rheologies such as anisotropy and poroelasticity, or to generate meshes for highly 
complex models (see Fig. 1.3). 

For seismic wave propagation beyond a certain scale (> 1,000 km) the Earth’s
curvature can no longer be neglected. This concerns regional or continental wave 
propagation as well as global or planetary seismology. Spherical coordinates would 
be one method, but there are certain restrictions concerning the model domain. 
An alternative is the use of the cubed-sphere approach, which is applicable for both 
regional and global wave propagation and has been very successfully employed 
using spectral-element or discontinuous Galerkin methods. 

Other domains of application include volcanology. On volcanoes a tremendously
wide spectrum of seismic signals is observed, and modelling them is
a challenging task. Depending on the frequency range involved, the often
substantial topography needs to be incorporated. Topography combined with
a free-surface boundary condition is something that is basically impossible to
implement with some methods, but easy with others.

Fig. 1.3 Exploration problems. Rectangular mesh for the Marmousi benchmark
model for reservoir simulations. The mesh honours fine layering and
internal faults. Figure from Capdeville et al. (2015).

Fig. 1.4 One of our ultimate goals is to reduce the misfit between observations
(black traces) and theoretical seismograms (red traces). In this case, seismograms
are compared for the damaging M6.3 L’Aquila, Italy earthquake
in 2009 (yellow star) at several European stations (red triangles) with calculations
using the discontinuous Galerkin method. From Wenk et al. (2013).

Finally, it is important to mention the field of seismic tomography and in particular
full waveform inversion. We are currently going far beyond extracting only
a few bytes of information from observed seismograms like travel times and 
explaining them using ray theory. The future is in matching complete wave- 
forms, that is, calculating synthetic seismograms through 3D models and directly 
comparing them with observations. An example comparing theory with obser- 
vations is given in Fig. 1.4. Full waveform inversion is an iterative procedure 
that requires the calculation of a great many forward problems, progressively im- 
proving the fit between synthetic and observed seismograms. I predict that in a 
few years from now we will routinely do full waveform inversion on all scales 
with tremendously improved images of the Earth’s interior. Eventually it might 
lead to new theories on how our planet Earth works. But there is a long way 
to go! 
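
The quantity that such an iterative procedure tries to reduce is a measure of misfit between observed and synthetic seismograms. As a minimal sketch, the function below computes a least-squares waveform misfit for a single trace. This is only one possible choice of misfit, and the function name, sampling interval, and test signals are assumptions made for illustration.

```python
import math


def l2_misfit(observed, synthetic, dt):
    """Least-squares waveform misfit: 0.5 * dt * sum((syn - obs)**2)."""
    if len(observed) != len(synthetic):
        raise ValueError("traces must have the same number of samples")
    return 0.5 * dt * sum((s - o) ** 2 for o, s in zip(observed, synthetic))


# Hypothetical traces: the 'observation' is a sine wave, the synthetics
# have the wrong amplitude by 50% and by 10%, respectively.
dt = 0.01
observed = [math.sin(2.0 * math.pi * i * dt) for i in range(100)]
first_guess = [1.5 * o for o in observed]
improved = [1.1 * o for o in observed]
print("misfit of first guess:", l2_misfit(observed, first_guess, dt))
print("misfit of improved model:", l2_misfit(observed, improved, dt))
```

In an actual inversion each evaluation of the misfit requires a new forward simulation through the current 3D model, which is where the computational cost comes from.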

This volume aims at providing you with a very basic introduction to the var- 
ious numerical methods that are currently used for seismic wave-propagation 
problems. The focus is on the fundamental principles of the methods that 
lead to the properties that are relevant to the question of which solver works best
for a certain problem. To present a complete 3D implementation goes beyond 
the scope of this volume. Relevant references are provided at the end of each 
chapter. 




1.3 Target audience and level 


This volume is based on lectures given by the author and his colleagues at the 
Ludwig-Maximilians University, Munich, as well as in short courses at foreign 
institutions and during the workshops of two EU-funded training networks, 
SPICE (2003-7) and QUEST (2009-13) in the field of computational seismol- 
ogy. The volume can be used as a basis for a one-semester or two-semester course 
for senior undergraduates, or junior postgraduates of physics, Earth sciences, 
or engineering with sufficient background in mechanics and analysis. It also ad- 
dresses experienced researchers who intend to use some of the community codes 
that are on offer today (see the Appendix), and seek a quick overview of numerical 
methods applied to the elastic wave equation. 

Experience with teaching numerical methods and guiding students who are 
starting to use community codes for 3D seismic wave propagation for many
years led me to conclude that an excellent preparation for a research project in com- 
putational seismology with any method is to start by coding the wave equation from 
scratch in 1D and to explore its capabilities and traps. With some of the methods this 
is done in almost no time, while with others this simple problem might already be 
quite involved. Yet, it is worth the effort! 

To help achieve this goal substantial supplementary electronic material is 
provided (see Appendix). This involves elementary ingredients of numerical 
methods such as finite-differencing, numerical integration, and function approx- 
imation, as well as complete 1D (some 2D) solutions for each of the numerical 
methods introduced. I strongly recommend that you first try out coding a solu- 
tion yourself, and then consult the solutions! Alternatively, the codes can be used 
as a starting point to solve the many computer exercises given at the end of each 
chapter. 

To enable easy, fast, and direct access to the practical material, computer codes 
are provided as Python-based Jupyter notebooks through a dedicated server (i.e. 
no download necessary; codes can be run anywhere with internet access, even on 
your smart phone). This should enable lecturers to easily and immediately use 
the volume plus the supplementary material for teaching. Many of the codes are 
also available in Matlab®. 

This is neither a seismology volume nor a maths volume. It is somewhat in 
between. The term practical in the title refers to the priority given to presenting 
the numerical algorithms for each method in combination with implementation 
in a computer code. Sometimes the path from a mathematical algorithm to a code 
can be painful, in particular if one has to rely on algorithms presented in research 
papers where details are often omitted. 

The mathematical background required to understand the numerical tools 
presented here strongly depends on the methods themselves. We use elements of 
calculus, linear algebra, functional analysis, and partial differential equations that 
should be covered in undergraduate or postgraduate lectures like Mathematical 
Methods for Earth Scientists (Physicists, Engineers). These elements include 
• Taylor series, Fourier series
• Fourier transforms, convolution theorem
• Exponential functions, complex numbers
• Function interpolation
• Polynomial functions (e.g. Chebyshev, Lagrange, Legendre)
• Numerical integration
• Vector, matrix calculations
• Eigenvector analysis
• Vector field operations (curl, div, grad)


It might, therefore, be useful to have your favourite maths handbook at hand. 

The specific sequence of numerical methods presented—from simple to more 
complicated—somehow also reflects the chronology of the evolution of numerical 
methods applied to wave-propagation problems, and is intentional. 


1.4 How to read this volume 


The volume is divided into three parts. 

Part I serves as an introduction to fundamental aspects of seismic wave prop- 
agation, the discrete world, and computations. These chapters are written with 
a clear focus on what is relevant when you use numerical solutions to wave- 
propagation problems. What are the governing equations? What are analytical 
solutions to simple problems? How do we describe seismic sources? What bound- 
ary conditions apply? In addition, we introduce some basic concepts of describing 
wavefields in a discrete way and the consequences for large-scale computations. 

Part II is the heart of the volume, with six chapters on specific numerical 
methods. The finite-difference method is covered in Chapter 4 in quite some de- 
tail, including an analytical way of treating the numerical approximations (von 
Neumann analysis), leading to some fundamental results (e.g. stability crite- 
rion, numerical anisotropy) which are also relevant for the other numerical 
methods. All methods rely on finite-difference-type approximations for the 
time-dependent (extrapolation) part. Therefore, this chapter is essential. 

In the chapter on the pseudospectral method, the concepts of exact interpo- 
lation and cardinal functions are introduced. These will play an important role 
later in high-order Galerkin methods. Both Fourier and Chebyshev methods are 
presented. 

Despite their intimate relation, there are separate chapters on the finite-element 
and the spectral-element method. The finite-element method is introduced in the 
simplest possible form using the static elastic case with linear basis functions. The 
spectral-element method is developed using Lagrange polynomials and Gauss 
integration leading to the attractive explicit scheme that is so popular today. 


The chapter on the finite-volume method takes the scalar advection equation as 
a starting point, and shows that it is formally equivalent to the problem of wave 
propagation. The concept of numerical fluxes is introduced, based on analytical 
solutions of the Riemann problem. The finite-volume method is also discussed 
as a direct application of Gauss’s theorem, allowing numerical solutions for finite 
volumes of arbitrary shape. 

Finally, the most recent numerical approach is the discontinuous Galerkin 
method, which joins the best part of the spectral-element method with the flux 
scheme developed for the finite-volume method. We focus on the nodal form of 
the method using the same Lagrange polynomial basis functions as for the 
spectral-element method. 

One of the fascinating aspects of this zoo of numerical methods, which some- 
times have entirely different starting points, is the fact that, in their simplest 
(linear) form, most of them are basically identical. Whenever possible this is highlighted 
in the text. Nevertheless, their pros and cons become apparent as soon as 
more realistic simulation scenarios are envisaged. 

The many domains of applications of the methods presented in this volume 
are discussed in Part III. For each application, domain-specific requirements are 
presented with some indications as to which methodologies might work best. This 
is complemented by a discussion of some open issues and current hot topics in 
computational seismology. 

In the Appendix, a list of links to currently accessible community numeri- 
cal solvers for seismic wave-propagation problems, data analysis, visualization, 
and data access is provided. Furthermore, access to the supplementary electronic 
material and some information on content is detailed. 

The technical chapters on the various methods in Part II each have the same 
structure. A brief historical overview is followed by presenting the method in 
a nutshell, highlighting its most important aspects. Each chapter then explores 
the details of the method in question. Towards the end, a Road to 3D section 
provides some information on where to find readable complete 3D algorithms in 
the literature. A Summary is then given, followed by a Further reading section. The 
chapter then concludes with Exercises, divided into (1) comprehension questions, 
(2) theoretical problems, and (3) programming exercises. This structure makes 
the technical chapters quite self-contained, with the drawback that there is some 
repetition. 


1.5 Code snippets 


The volume contains fragments of Python codes taken from the supplementary 
material that can be downloaded or run as interactive Jupyter notebooks online 
(<http://www.seismo-live.org>). It is important to note that they are presented in 
a style that optimizes readability in the sense that someone not so familiar with 
Python (and maybe more familiar with Matlab®) can understand what’s going 
on. In any case it is advisable to spend some time with the Python introductory 
material available online at the site given above. 

Nevertheless, a few important points are given here. Even though it is 
considered bad practice, we import (sub-) libraries such as numpy using the 
commands 


from numpy import * 
from numpy.fft import * 


This allows us to use Matlab® style calls to intrinsic routines such as 


# Fast Fourier transform of vector f 
F = fft(f) 


or to initialize vectors with 
time = linspace(0, nt * dt, num=nt) 
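
For readers who prefer explicit namespaces, the two calls above can be written as in the following sketch. The values of `nt` and `dt` and the 5 Hz test signal are illustrative assumptions, not taken from the text. The sketch also shows a detail that is easy to overlook: `linspace` includes both end points, so the sample spacing of the vector above is `nt * dt / (nt - 1)` rather than exactly `dt`.

```python
import numpy as np

# Illustrative values (assumptions, not taken from the text)
nt = 1000      # number of time samples
dt = 0.01      # time step in seconds

# Explicit-namespace version of the star-import call above
time = np.linspace(0, nt * dt, num=nt)
spacing_linspace = time[1] - time[0]   # equals nt * dt / (nt - 1)

# Alternative time axis with a sample spacing of exactly dt
time_exact = np.arange(nt) * dt
spacing_arange = time_exact[1] - time_exact[0]

# Fast Fourier transform of a test signal f (here a 5 Hz sine)
f = np.sin(2 * np.pi * 5.0 * time_exact)
F = np.fft.fft(f)
freqs = np.fft.fftfreq(nt, d=dt)
peak_frequency = abs(freqs[np.argmax(np.abs(F[: nt // 2]))])
```

Whether the small difference in spacing matters depends on how the time axis is used; for plotting it is usually harmless, for comparisons with analytical solutions it may not be.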


A feature that enables quite dense coding and mimics the implicit matrix-vector 
calculations of Matlab® is the @ sign used in Python 3.5 versions (or higher). For 
example, 


from numpy.linalg import * 
# [...] Initialize matrices 
A = R @ L @ inv(R) 


would correspond to matrix operations for an eigenvalue problem 
$$A = R L R^{-1},$$


where A is a square matrix, R contains its eigenvectors, and L is a diagonal matrix 
with eigenvalues. 
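
As a minimal check of this relation, the following sketch decomposes a small test matrix with `numpy.linalg.eig` and reassembles it with the `@` operator. The matrix entries are an arbitrary illustrative choice, and explicit namespaces are used instead of the star imports.

```python
import numpy as np

# Small symmetric test matrix (an arbitrary illustrative choice)
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

# eig returns the eigenvalues and a matrix whose columns are the eigenvectors
eigenvalues, R = np.linalg.eig(A)
L = np.diag(eigenvalues)

# Reassemble A from its eigen-decomposition with the @ operator
A_rebuilt = R @ L @ np.linalg.inv(R)
reconstruction_error = np.max(np.abs(A - A_rebuilt))
```

The reconstruction error should be at the level of floating-point round-off.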

The material presented here is about waves that propagate. It is so much fun 
bringing life to the figures in this volume: seeing waves propagate and scatter after 
writing your own code or using the available supporting material. By doing so you 
will be able to better understand the underlying numerical mathematics and learn 
wave-propagation phenomena. Have fun! 


FURTHER READING 


There follows a list of books covering general seismology, basic maths, and clas- 
sic approaches such as the reflectivity method, ray theory, and normal mode 
solutions. 


Kennett (1983) presents the theory for the calculation of synthetic seismo- 
grams in stratified media (e.g. reflectivity method). 


Cerveny (2001) is the classic textbook on seismic ray theory. 


Nolet (2008) provides a comprehensive introduction to seismic tomogra- 
phy, including the basic theory of wave propagation using ray and Born 
approximations. 


Dahlen and Tromp (1998) is an exhaustive textbook on global wave 
propagation using normal mode techniques. 


Chapman (2004) presents a comprehensive introduction to the propaga- 
tion of high-frequency body waves in elastodynamics. 


Snieder (2015) provides an entertaining tour of the mathematical knowl- 
edge and techniques that are needed by students across the physical 
sciences, with very illustrative examples. 


The Encyclopedia of solid Earth geophysics (ed. Gupta, 2011) covers many 
fundamental research problems in seismology and also contains several 
review papers on methods discussed in this volume. 


The Treatise on Geophysics (Schubert, 2015) is maybe the most com- 
plete collection of review articles in solid Earth geophysics. Volume 1 (Deep 
Earth Seismology) contains a collection of articles on various approaches 
to the seismic forward problem. 


The International Handbook of Earthquake and Engineering Seismol- 
ogy (Lee et al., 2002) contains about 50 review-style articles on many 
fundamental aspects of seismology. 


The New Manual of Seismological Observatory Practice (ed. Bormann, 
2012) is an open-access collection of articles on many aspects of seis- 
mic wave propagation, seismic sources, instrumentation, networks, and 
processing. 
Seismic Waves and Sources 


So, you want to calculate synthetic seismograms, or dive into a seismological 
research problem with a complicated Earth model or earthquake source mech- 
anism? Most likely you will be making use of a community code to get going, or 
a program that is handed to you by other researchers or your supervisor. There 
will be a parameter file that allows you to change the set-up, some of the entries 
being obvious, some less so. It is usually quite easy to obtain the first results. But 
how do you check that the results are correct? Is every wiggle really accurate in 
travel time and amplitude? Is your Earth model correctly implemented? Are the 
sources and receivers correctly positioned? 

These questions are difficult to answer for complex models and we will 
later devise some strategies on how to overcome this. When you start using 
any solver for seismic wave propagation (well-tested or not) it is wise to be- 
gin with a simple earthquake source and Earth model (e.g., a homogeneous 
half-space) and to check whether the seismograms make sense. This chapter 
aims at providing some hints as to what you should expect for simple media, 
and what fundamental strategies are available to help you obtain correct re- 
sults. This will require a basic understanding of seismic sources and elastic wave 
propagation. 

This introduction cannot replace a textbook on seismology (see recommen- 
dations at the end of this chapter). The topics are introduced with a view 
to their relevance for seismic simulation problems. Key issues are: (1) What 
are the governing equations for elastic wave propagation? (2) What phe- 
nomena do we expect in simple Earth models? (3) What boundary condi- 
tions apply? (4) How are seismic sources described? and (5) What are the 
consequences of linearity and reciprocity of the elastic wave equation for 
simulations? 

Let us introduce some fundamental concepts relevant for wave simulations 
by looking at real global seismic wavefield observations. The simple principles 
discussed in what follows can easily be adapted to other scales (e.g. reservoirs, 
sedimentary basins, volcanoes, rock samples). In Fig. 2.1 the vertical component 
of a broadband velocity seismogram is shown representing the ground motion at 
station WET in southern Germany following the devastating Sumatra-Andaman 
earthquake with moment magnitude M9.1 that occurred on 26 December 2004. 
Before we start discussing these observations, let us introduce one of the most 
important relations that you need in order to plan, check, and understand 
seismic simulation results: the connection between wavenumber $k$ and angular 
frequency $\omega$ (or frequency $f$), or wavelength $\lambda$ and period $T$: 

$$c = \frac{\omega}{k} = \frac{\lambda}{T} = \lambda f, \tag{2.1}$$

where $c$ is phase velocity. Let us play around with this simple relation given the 
seismograms in Fig. 2.1. 

Fig. 2.1 Vertical component ground velocity seismogram of the M9.1 Sumatra-Andaman earthquake of 26 December 2004, recorded in southern Germany (WET) with an STS-2 seismometer. Left: Original broadband seismogram. Middle: Low-pass filtered seismogram with corner period 40 s. Right: Low-pass filtered with corner period 100 s. Amplitudes are normalized. 

First, by visual inspection of the original broadband record we can appreciate 
that a wide range of frequencies make up the entire seismogram. The initial 
high-frequency part consists of body waves, mainly compressional P- and shear 
SV-wave energy (shear-wave polarized in the vertical plane) with a maximum 
frequency of around 1 Hz. What does this imply for the spatial wavelengths in- 
side the Earth? The Preliminary Reference Earth Model (PREM) (Dziewonski 
and Anderson, 1981) shown in Fig. 2.2 tells us that near the Earth’s surface P-velocities 
are around 6 km/s and shear velocities close to 3 km/s (ignoring the 
oceans). That means the shortest wavelength $\lambda_{\min}$ is likely to be around 3 km. 
As velocity increases with depth (at least down to the core—mantle boundary), 
wavelengths will almost always be longer. This is important as in all the numer- 
ical solvers discussed later we discretize the Earth’s interior with grid points or 
elemental cells, and need to make sure that the smallest wavelength is accurately 
sampled. 
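
The wavelength estimate follows directly from the relation between phase velocity, frequency, and wavelength, $\lambda = c / f$. A minimal sketch using the numbers quoted in the text (the function name `wavelength` is a hypothetical helper introduced here for illustration):

```python
# Values quoted in the text for the near-surface Earth
f_max = 1.0        # maximum frequency in Hz
vp = 6.0           # P-wave velocity in km/s
vs = 3.0           # shear-wave velocity in km/s

def wavelength(c, f):
    """Wavelength (km) from phase velocity c (km/s) and frequency f (Hz)."""
    return c / f

lambda_p = wavelength(vp, f_max)
lambda_s = wavelength(vs, f_max)
lambda_min = min(lambda_p, lambda_s)
print(f"P: {lambda_p} km, S: {lambda_s} km, shortest: {lambda_min} km")
```

The shear waves, being slower, set the shortest wavelength and therefore the grid spacing.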

Fig. 2.2 Global reference model. The Preliminary Reference Earth Model (PREM) (Dziewonski and Anderson, 1981) shown here in its isotropic form with seismic velocities and density in a spherically symmetric Earth. 

Fig. 2.3 Sampling waves. Sinusoidal wave function sampled with 10 grid points per wavelength (more precisely 10 segments or elements per wavelength, and 11 grid points, including the boundaries). 

Fig. 2.4 Triangular-element mesh for global seismic velocity structure. The mesh is designed such that the number of points per wavelength for a given frequency is approximately constant everywhere (Käser and Dumbser, 2006). 

¹ In the wave-propagation context the term dispersion denotes the frequency (or wavenumber) dependence of seismic velocities. 

Sampling a wave-like function is characterized by a concept called the number 
of grid points per wavelength, which will play a central role in all numerical methods 
(see Fig. 2.3). It is instructive to calculate how many cells (or points) you 
would need for the entire Earth if you were to discretize her regularly with cubic 
elements and, for example, require that the shortest wavelength is sampled 
with at least 10 points. Let’s use the numbers above: 


• The shortest wavelength is 3 km and we require it to be sampled with at least 10 points. That means the maximum grid spacing (or cubic element side) should be $h = 300$ m. 
• As we are in 3D the volume of our cube element is $V_e = h^3$. 
• The volume $V_E$ of the entire Earth with radius $r_E = 6{,}371$ km can be calculated as $V_E = \frac{4}{3} \pi r_E^3$. 
• The number of required elements is $V_E / V_e \approx 4 \times 10^{13}$. 
• One space-dependent field (e.g. density, displacement, Lamé parameters) at double precision (8 bytes per number) would thus require 320 TBytes. 
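
The steps of this estimate can be sketched in a few lines. The function name `grid_estimate` and its arguments are hypothetical helpers introduced for illustration; memory is reported in decimal terabytes (10^12 bytes), which is an assumption about the unit intended in the text.

```python
import math

def grid_estimate(lambda_min_km, points_per_wavelength, radius_km=6371.0,
                  bytes_per_value=8):
    """Number of cubic elements and memory per field for a regular grid."""
    h = lambda_min_km / points_per_wavelength          # element side (km)
    v_element = h ** 3                                 # element volume (km^3)
    v_earth = 4.0 / 3.0 * math.pi * radius_km ** 3     # Earth volume (km^3)
    n_elements = v_earth / v_element
    memory_bytes = n_elements * bytes_per_value
    return h, n_elements, memory_bytes

h, n_elements, memory_bytes = grid_estimate(3.0, 10)
memory_tbytes = memory_bytes / 1e12   # decimal terabytes
print(f"h = {h * 1000:.0f} m, elements = {n_elements:.1e}, "
      f"memory per field = {memory_tbytes:.0f} TBytes")
```

Calling the function with a longer minimum wavelength shows how quickly the requirements drop: the element count scales with the inverse cube of the wavelength.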


Wow! Clearly, even on today’s supercomputers this is a tremendous challenge, 
and besides, we would oversample large parts of the Earth’s interior, where ve- 
locities reach 13 km/s (how many points per wavelength would that correspond 
to?). This suggests that for problems with strongly varying velocity models regu- 
lar grids do not make sense. An example of a spherical mesh (reduced to 2D) with 
point- or (element-) density adapted to the velocity model is shown in Fig. 2.4. In 
principle, meshes for heterogeneous media can be designed such that the number 
of grid points per wavelength is approximately constant throughout the model. 
More on this later. 

Let us have a look at the spectral content of our observations (see Fig. 2.5). 
The largest amplitudes appear between the frequencies 0.02 and 0.06 Hz, characteristic 
of very large earthquakes. The low-pass filtered spectra (and seismograms) 
reveal that substantial energy with high signal-to-noise ratio is present at low 
frequencies. 

These dominant amplitudes are Rayleigh-type surface waves that start around 
t = 2500 s, lasting for several hundreds of seconds. We note strong physical dispersion.¹ 
Long-period surface waves arrive much earlier than short-period surface 
waves. The period range of Rayleigh waves spans an interval from about tens to 
hundreds of seconds. 

Performing a similar calculation as presented above for surface waves with 
appropriate phase velocities (e.g. for cut-off period T= 40 s) we obtain model 
sizes that fit on today’s common institutional computer clusters. Note that— 
depending on epicentral distance—we would only have to discretize the upper 
mantle and crust. An important point is that surface waves are a consequence 
of the stress-free boundary condition. This implies that when using numerical 
methods and when surface waves are the target, this boundary condition should 
be implemented with high accuracy. 

This illustration with observed seismograms serves to show that (1) target 
frequencies, (2) seismic phases to be modelled (body waves or surface waves), 
and (3) seismic velocity heterogeneity are dominant factors on how Earth models 
need to be discretized. Seismograms are affected by both source and structure. 


In the following sections we will present the relevant equations and theoretical 
concepts used in computational seismology, some of which we will discretize and 
approximate with numerical methods later on. This is kept at a very fundamental 
level. For further details please refer to the general seismology textbooks listed at 
the end of this chapter. 


2.1 Elastic wave equations 


In the following we will present several forms of the elastic wave equation, starting 
with the complete set of equations in 3D, ending with the specific form in 1D, 
which we will use as the central equation to be solved employing the various 
numerical techniques in the subsequent chapters. Throughout the volume we 
alternate between assuming space-time dependencies implicitly or stating them 
explicitly. Often this is a matter of space and/or clarity. The reader is encouraged 
to return to this chapter in case the dependencies are not clear. 

Let us introduce the key players in our problem: our unknown field that we 
would like to determine is either the displacement field $u_i(\mathbf{x}, t)$ or its time derivative, 
the velocity field $v_i(\mathbf{x}, t) = \partial_t u_i(\mathbf{x}, t)$. The displacement field determines the 
strain field $\epsilon_{ij}(\mathbf{x}, t)$, that in turn is proportional to the stress field $\sigma_{ij}(\mathbf{x}, t)$ with the 
general fourth-order tensor of elastic constants $c_{ijkl}(\mathbf{x})$ as proportionality factors. 
Elastic constants and space-dependent density $\rho(\mathbf{x})$ constitute the geophysical 
properties of an elastic Earth model. The seismic sources are characterized either 
by the seismic moment tensor $M_{ij}(\mathbf{x}, t)$ or by the volumetric forces $f_i(\mathbf{x}, t)$. In 
summary, the dependencies are 


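$$
\begin{aligned}
u_i &\rightarrow u_i(\mathbf{x}, t) & i &= 1, 2, 3 \\
v_i &\rightarrow v_i(\mathbf{x}, t) & i &= 1, 2, 3 \\
\sigma_{ij} &\rightarrow \sigma_{ij}(\mathbf{x}, t) & i, j &= 1, 2, 3 \\
\epsilon_{ij} &\rightarrow \epsilon_{ij}(\mathbf{x}, t) & i, j &= 1, 2, 3 \\
\rho &\rightarrow \rho(\mathbf{x}) & & \\
c_{ijkl} &\rightarrow c_{ijkl}(\mathbf{x}) & i, j, k, l &= 1, 2, 3 \\
f_i &\rightarrow f_i(\mathbf{x}, t) & i &= 1, 2, 3 \\
M_{ij} &\rightarrow M_{ij}(\mathbf{x}, t) & i, j &= 1, 2, 3,
\end{aligned}
$$

where $i, j, k, l$ are determined by the dimensionality of the problem (1D, 2D, 
or 3D). 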

These players make up the elastic wave equation which we first show in the 
displacement form for isotropic media: 


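$$
\begin{aligned}
\rho\, \partial_t^2 u_i &= \partial_j (\sigma_{ij} + M_{ij}) + f_i \\
\sigma_{ij} &= \lambda\, \epsilon_{kk}\, \delta_{ij} + 2 \mu\, \epsilon_{ij} \\
\epsilon_{kl} &= \tfrac{1}{2} (\partial_k u_l + \partial_l u_k),
\end{aligned} \tag{2.2}
$$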
Fig. 2.5 Normalized amplitude spectrum of the original broadband (BB) vertical-component velocity seismogram of Fig. 2.1 (black line) and the spectra after filtering with cut-off period $T_c = 40$ s (dashed line), and $T_c = 100$ s (dotted line). The spectra are shown for the low-frequency part below 0.1 Hz. 

Fig. 2.6 The 1D elastic wave equation for transverse motion is descriptive of vibration problems of strings. 

² Summation over the index or indices that appear twice in a term, e.g. $\epsilon_{kk} = \epsilon_{11} + \epsilon_{22} + \epsilon_{33}$. Indeed, Einstein introduced this compact notation in his treatise on general relativity in 1916. 

³ You might wonder why we do not use the acoustic equation to introduce the various numerical methods. The point is that the elastic version (Eq. 2.4) contains derivatives of the elastic parameters leading to specific features (e.g. grid staggering) common in today’s numerical solvers. 


where the Einstein summation convention² applies. $\lambda(\mathbf{x})$ and $\mu(\mathbf{x})$ are the Lamé 
parameters, the latter being the shear modulus; $\delta_{ij}$ is the Kronecker delta. We will 
discuss anisotropic elastic parameters in the section on rheologies. 

In principle, these nested relations can be merged into one equation. As an 
example, we show this for one component (i= 2, i.e. the y-component) 


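$$
\begin{aligned}
\rho\, \partial_t^2 u_y = {} & \partial_x \left[ \mu (\partial_x u_y + \partial_y u_x) + M_{yx} \right] \\
& + \partial_y \left[ (\lambda + 2\mu)\, \partial_y u_y + \lambda (\partial_x u_x + \partial_z u_z) + M_{yy} \right] \\
& + \partial_z \left[ \mu (\partial_z u_y + \partial_y u_z) + M_{yz} \right] \\
& + f_y
\end{aligned} \tag{2.3}
$$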


with equivalent expressions for the other two motion components (see exercises). 
We anticipate that the temporal and spatial derivatives in these equations cannot 
be solved analytically in the general case, which is the reason why we have to em- 
ploy numerical methods. As the focus of this volume is to introduce the concepts 
of a variety of numerical methods for this equation, we keep the specific form of 
the elastic wave equation as simple as possible. Therefore, we reduce this equation 
to a 1D problem by assuming (1) propagation in x-direction, (2) displacement 
perpendicular to the propagation direction (transverse motion in y-direction), 
and (3) initialization of wave propagation either by external forcing $f_y$ or by an 
appropriate initial condition. With these assumptions applied to Eq. 2.3, and an 
external force term $f_y$, we obtain 

$$\rho\, \partial_t^2 u_y = \partial_x (\mu\, \partial_x u_y) + f_y. \tag{2.4}$$
This equation is descriptive of transversely polarized elastic waves propagating 
in x-direction (e.g. motion of a string, see Fig. 2.6). Analytical solutions to this 
equation that are extremely useful for validating numerical results are presented 
in the following sections. 

There is another form of wave equation that can be derived from Eq. 2.2, 
assuming constant density and vanishing shear modulus $\mu$. This so-called acoustic 
wave equation describes the propagation of compressional waves in media like fluids 
and gases and is still widely used today to describe P-wave fields, for example 
in exploration problems. The acoustic wave equation reads 

$$\partial_t^2 p = c^2 \Delta p + s, \tag{2.5}$$

where $p(\mathbf{x}, t)$ is the unknown pressure field, $c(\mathbf{x})$ is acoustic velocity, $s(\mathbf{x}, t)$ is a 
pressure source field, and $\Delta = \nabla^2 = \partial_x^2 + \partial_y^2 + \partial_z^2$ is the Laplace operator.³ 

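Yet another form of the elastic wave equation that plays an important role in 
computational seismology is the so-called velocity-stress formulation. We replace 
the displacement field $\mathbf{u}(\mathbf{x}, t)$ by its time derivative, the velocity field $\mathbf{v}(\mathbf{x}, t) = 
\partial_t \mathbf{u}(\mathbf{x}, t)$, and obtain a set of coupled equations 

$$
\begin{aligned}
\rho\, \partial_t v_i &= \partial_j (\sigma_{ij} + M_{ij}) + f_i \\
\partial_t \sigma_{ij} &= \lambda\, \partial_t \epsilon_{kk}\, \delta_{ij} + 2 \mu\, \partial_t \epsilon_{ij} \\
\partial_t \epsilon_{ij} &= \tfrac{1}{2} (\partial_i v_j + \partial_j v_i).
\end{aligned} \tag{2.6}
$$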


It is important to note that this is a first-order partial-differential equation of hy- 
perbolic form (more on this later) and both velocities and stresses are considered 
unknown. Another point is that we are not directly taking derivatives of the elastic 
parameters. Most current finite-difference approaches are based on solutions to 
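these coupled equations. With the mapping $v = v_y$, $f = f_y$, and $\sigma = \sigma_{xy}$ (and 
an external force only) this reduces to the following system of equations in 1D, 
which we will use extensively in later chapters 

$$
\begin{aligned}
\rho\, \partial_t v &= \partial_x \sigma + f \\
\partial_t \sigma &= \mu\, \partial_x v.
\end{aligned} \tag{2.7}
$$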


For the sake of simplicity we will restrict ourselves to elastic isotropic media when 
discussing various numerical solutions to the above equations. In the following we 
will take a purely mathematical approach and discuss the few analytical solutions 
that are available for the most basic wave equations. These solutions are extremely 
useful for checking the numerical solutions to be developed later for the case of 
homogeneous media. 


2.2 Analytical solutions: scalar wave 
equation 


In this section we present analytical solutions for the scalar (acoustic) wave 
equation 


$$\partial_t^2 p(\mathbf{x}, t) - c^2 \Delta p(\mathbf{x}, t) = s(\mathbf{x}, t) \tag{2.8}$$


assuming constant velocity c and infinite space. Note that in 1D and 2D this equa- 
tion is mathematically equivalent to the problem of SH wave propagation (i.e. 
shear waves polarized perpendicular to the plane through source and receiver). 
In 3D it is (only) descriptive of pressure (sound) waves. There are several ways 
to initiate propagating waves. The simplest case is when there is no external 
source, thus $s(\mathbf{x}, t) = 0$, but an initial pressure (or displacement) field exists at 
$t = 0$ without time variation 

$$
\begin{aligned}
p(x, t = 0) &= p_0(x) \\
\partial_t p(x, t = 0) &= 0.
\end{aligned} \tag{2.9}
$$


Fig. 2.7 Illustration of the analytical solution for a pressure (displacement) initial condition $p_0(x)$ (dashed line). The solution $p(x, t)$ corresponds to the initial waveform propagating in both directions undisturbed with velocity $c$ (solid line). 

Fig. 2.8 Boxcar functions for $dx \in [1 \ldots 20]$ as an example of a function that converges to a $\delta$-function. As their width goes to zero their amplitude goes to infinity. See text for details. 

The solution to this problem is 

$$p(x, t) = \frac{1}{2} p_0(x - ct) + \frac{1}{2} p_0(x + ct), \tag{2.10}$$


corresponding to the waveform po(x) being advected undisturbed (with half the 
initial amplitude) in positive and negative x-direction. This is illustrated in Fig. 2.7 
with a waveform of Gaussian shape. 
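
A minimal sketch of this analytical solution for a Gaussian initial waveform is given below. It evaluates the two counter-propagating half-amplitude waveforms on a regular grid. The width `sigma`, the velocity, the grid, and the observation time are illustrative assumptions.

```python
import numpy as np

def p0(x, sigma=1.0):
    """Gaussian initial waveform (the width sigma is an illustrative choice)."""
    return np.exp(-x**2 / (2.0 * sigma**2))

def p_analytical(x, t, c):
    """Solution for the initial condition p0 with zero initial time derivative."""
    return 0.5 * p0(x - c * t) + 0.5 * p0(x + c * t)

c = 2.0                                  # velocity (arbitrary units)
x = np.linspace(-50.0, 50.0, 2001)       # spatial axis, spacing 0.05
t = 10.0                                 # observation time

p_start = p_analytical(x, 0.0, c)        # reproduces p0(x)
p_later = p_analytical(x, t, c)          # two pulses centred at -c*t and +c*t

x_right_peak = x[x > 0][np.argmax(p_later[x > 0])]
peak_amplitude = p_later.max()
```

Once the two pulses have separated, each should carry half the initial amplitude and be centred at a distance `c * t` from the origin, which makes this a convenient reference for checking a numerical solver.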

Analytical solutions for inhomogeneous partial differential equations (i.e. with 
non-zero source terms) are usually developed using the concept of Green’s 
functions $G(\mathbf{x}, t; \mathbf{x}_0, t_0)$. Green’s functions are the solutions to the specific partial 
differential equations for $\delta$-functions as source terms evaluated at $(\mathbf{x}, t)$ and 
activated at $(\mathbf{x}_0, t_0)$. Thus we seek solutions to 

$$\partial_t^2 G(\mathbf{x}, t; \mathbf{x}_0, t_0) - c^2 \Delta G(\mathbf{x}, t; \mathbf{x}_0, t_0) = \delta(\mathbf{x} - \mathbf{x}_0)\, \delta(t - t_0), \tag{2.11}$$


where $\Delta$ is the Laplace operator. We recall the definition of the $\delta$-function as a 
generalized function with 

$$\delta(x) = \begin{cases} \infty & x = 0 \\ 0 & x \neq 0 \end{cases} \tag{2.12}$$

and 

$$\int_{-\infty}^{\infty} \delta(x)\, dx = 1, \qquad \int_{-\infty}^{\infty} f(x)\, \delta(x)\, dx = f(0). \tag{2.13}$$

When comparing numerical with analytical solutions the functions that, in the 
limit, lead to the $\delta$-function will become very important. An example is the boxcar 
function 

$$\delta_{bc}(x) = \begin{cases} 1/dx & |x| \le dx/2 \\ 0 & \text{elsewhere,} \end{cases} \tag{2.14}$$

fulfilling these properties as $dx \to 0$. These functions are used to properly scale 
the source terms to obtain correct absolute amplitudes. The convergence of the 
boxcar function thus defined is illustrated in Fig. 2.8. 
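
The two defining properties can be checked numerically for a sequence of narrowing boxcars. The sketch below is one possible way to do this; the helper `integrate` (a plain trapezoidal rule), the test function `cos(x)`, and the chosen widths are assumptions made for illustration.

```python
import numpy as np

def boxcar(x, dx):
    """Boxcar of width dx and height 1/dx, centred at x = 0."""
    return np.where(np.abs(x) <= dx / 2.0, 1.0 / dx, 0.0)

def integrate(y, x):
    """Trapezoidal rule on a regular or irregular grid."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

x = np.linspace(-5.0, 5.0, 200001)       # fine grid, spacing 5e-5
f = np.cos(x)                            # smooth test function, f(0) = 1

for dx in [2.0, 1.0, 0.5, 0.1]:
    d = boxcar(x, dx)
    area = integrate(d, x)               # should stay close to 1
    sifted = integrate(f * d, x)         # should approach f(0) = 1
    print(f"dx = {dx:4.1f}  area = {area:.4f}  sifting integral = {sifted:.4f}")
```

The area stays close to one for every width, while the second integral approaches `f(0)` only as the boxcar narrows. In a discrete simulation the narrowest meaningful boxcar is one grid cell wide, which is why point sources are typically scaled by the inverse grid spacing.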

To describe analytical solutions for the acoustic wave equation we also make 
use of the unit step function, also known as the Heaviside function, defined as 


$$H(x) = \begin{cases} 0 & x < 0 \\ 1 & x \ge 0. \end{cases} \tag{2.15}$$


Table 2.1 Green’s functions for the inhomogeneous acoustic wave equation after Rienstra and Hirschberg (2016). 

The Heaviside function is the integral of the $\delta$-function (and vice versa the 
$\delta$-function is defined as the derivative of the Heaviside function). Omitting their 
derivation (see references at the end of this chapter) the Green’s functions for 
Eq. 2.11 are presented in Table 2.1 for any number of spatial dimensions. These 
analytical solutions are illustrated in Fig. 2.9. It is worth spending some time 
discussing these extremely important results. In the 1D case the Green’s function 
is proportional to a Heaviside function. As the response to an arbitrary source 
time function can be obtained by convolution this implies that the propagating 
waveform is the integral of the source time function. This is illustrated in Fig. 2.9 
(bottom left) where the response is shown for a source time function with a first 
derivative of a Gaussian. 
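
The statement about the 1D case can be verified with a short numerical convolution. The sketch below assumes the standard 1D Green's function of the scalar wave equation, a step of height $1/(2c)$ arriving at $t = |r|/c$, and one common parameterization of a Gaussian with dominant frequency `f0`. Velocity, distance, and time axis are illustrative choices.

```python
import numpy as np

c = 1.0            # velocity (illustrative units)
r = 2.0            # source-receiver distance
dt = 0.001         # time step
nt = 8000          # number of samples
t = -2.0 + dt * np.arange(nt)              # time axis from -2 to 6

f0 = 1.0                                   # dominant frequency (Hz)
a = (np.pi * f0) ** 2
gauss = np.exp(-a * t**2)                  # Gaussian centred at t = 0
src = -2.0 * a * t * gauss                 # its first time derivative

# Assumed 1D Green's function: step of height 1/(2c) arriving at t = r/c
G = np.where(t >= r / c, 1.0 / (2.0 * c), 0.0)

# Discrete convolution of source time function and Green's function
full = np.convolve(src, G) * dt
i0 = int(round(-t[0] / dt))                # index of t = 0 on the time axis
response = full[i0 : i0 + t.size]

# Expected: the time integral of src, i.e. a shifted Gaussian, times 1/(2c)
expected = np.exp(-a * (t - r / c) ** 2) / (2.0 * c)

# Compare away from the end of the record, where the finite Green's
# function array truncates the convolution
mask = t < 4.0
max_misfit = np.max(np.abs(response[mask] - expected[mask]))
```

The misfit should be small and of the order of the time step, reflecting the rectangle-rule integration implied by the discrete convolution.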

A special situation occurs in 2D. An impulsive source leads to a waveform with 
a coda that decreases with time. This is a consequence of the fact that the source 
actually is a line source (see discussion in the next chapter). From a computational 
point of view this is extremely important. Numerical solutions in 2D Cartesian 
coordinates cannot directly be compared to observations in which we usually have 
point sources. 

Finally, in 3D the Green’s function is proportional to a $\delta$-function scaled by 
the distance from the source (geometrical spreading). Beware of the practical significance 
of this result for sound wave propagation! Ideally, whatever your lecturer 
tells you should physically arrive undisturbed at your ears (unfortunately the same 
does not generally apply to the meaning of the message carried by the sound 
waves).⁴ 

Fig. 2.9 Analytical solutions to the scalar wave equation. Top: Green’s functions in 1D, 2D, and 3D. Bottom: Green’s functions obtained after convolution with the first derivative of a Gaussian with 1 Hz dominant frequency (see text for details). Note that the source time function is centred around t = 0. 

⁴ Furthermore, imagine a different kind of Green’s function. I don’t want to know what the integral of Beethoven’s Fifth Symphony would sound like! 

⁵ Here the term attenuation is clear and related to the loss of energy due to intrinsic processes. Sometimes (e.g. in earthquake engineering) the term attenuation is also used to describe the decay of amplitudes with distance due to geometrical spreading. 

⁶ The strain energy function is defined by $W = \frac{1}{2} \sigma_{ij} \epsilon_{ij}$ with summation over both indices. 

⁷ Note that we owe to the high Q inside our Earth the fact that we know so much about its interior from seismic tomography. 

To recover these results with numerical simulations for arbitrary spatio- 
temporal sources requires some care as to the scaling of point sources in the 
discrete world. This is discussed in more detail in connection with the specific 
numerical methods. 


2.3 Rheologies 


Non-isotropic rheologies are important for realistic applications and are usually 
covered in current solvers. Therefore we briefly discuss them in this section. The 
term rheology actually originates from the description of flowing material. In the 
context of solid material it describes how deformations (here: strains) are related 
to forces (here: stresses). In the following we briefly discuss (1) viscoelastic
material, (2) anisotropic material, and (3) poroelasticity. It is difficult to prioritize
these effects in terms of their relevance to modelling real observations. However,
the first two are almost equally important on all spatial scales. As we will not use
these rheologies in the numerical approximations developed later, they are
mentioned for completeness. The adaptation of rheologies to the specific numerical
solutions is sometimes easy (anisotropy), and sometimes not so easy (attenuation 
or poroelasticity). The interested reader is referred to the references given for 
each numerical method. 


2.3.1 Viscoelasticity and attenuation 


There is no such thing as perfect elasticity in nature. Thus, seismic wave fields are
permanently losing energy, for example due to micro-damage created during the
passage of waves or friction-induced conversion into heat. Such phenomena lead
to intrinsic attenuation,⁵ an energy loss that is described using the letter Q (from
quality factor).

Q is a dimensionless quantity and is defined by the fractional energy loss per
cycle

$$ \frac{1}{Q} = -\frac{\Delta E}{2\pi E}, \tag{2.16} $$

where $-\Delta E$ is the energy loss per cycle and E is the peak strain energy.⁶ Q is
usually considered independent of frequency, low values (e.g. Q = 10 in a reservoir)
corresponding to strong attenuation, high values (e.g. Q = 600 in the mantle) to
small attenuation.⁷

The decay A(x) for initial amplitude $A_0$ as a function of propagation distance
x for a monochromatic plane wave of angular frequency $\omega$, velocity c, and
quality factor Q is given by

$$ A(x) = A_0 \, e^{-\frac{\omega x}{2 c Q}}. \tag{2.17} $$

This equation is illustrated in Fig. 2.10 for a frequency of 1 Hz, a variety of Q
factors, and propagation distances up to 100 km.

Note that constant, frequency-independent Q has important consequences. 
As Q describes the energy loss per cycle, this implies that for a given propa- 
gation distance the high-frequency (short wavelength) part of the wavefield is 
substantially more attenuated than the low-frequency part, progressively altering 
the waveform. This effect explains the diminishing abundance of high frequencies 
away from a seismic source and determines the upper frequency limit in global 
seismograms, as seen in the introduction. 
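
The frequency dependence of the decay in Eq. 2.17 can be checked with a few lines of Python. This is only a minimal sketch: the function name and the velocity of 3 km/s are assumptions made for illustration (the text does not state the velocity used for Fig. 2.10), and the angular frequency is taken as $\omega = 2\pi f$.

```python
import numpy as np

def attenuated_amplitude(x, f, c, Q, A0=1.0):
    """Amplitude A(x) = A0 * exp(-omega * x / (2 * c * Q)) of a plane wave."""
    omega = 2.0 * np.pi * f          # angular frequency (rad/s)
    return A0 * np.exp(-omega * x / (2.0 * c * Q))

c = 3000.0                           # assumed velocity (m/s), illustrative only
x = 100e3                            # propagation distance (m)
for Q in (10, 50, 100):
    for f in (0.1, 1.0):
        a = attenuated_amplitude(x, f, c, Q)
        print(f"Q = {Q:4d}, f = {f:4.1f} Hz, A/A0 = {a:.3e}")
```

For a fixed distance and Q, the 1 Hz amplitude is always smaller than the 0.1 Hz amplitude, which is the progressive loss of high frequencies described above.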


2.3.2 Seismic anisotropy 


Today it is widely accepted that most of the Earth’s interior—from crust to inner 
core—shows anisotropic elastic behaviour. Most local observations above active 
faults, upper mantle body and surface waves, as well as body waves in the inner 
core cannot be adequately explained without some form of anisotropic symmetry 
system. 

In the most general case, the stress $\sigma_{ij}$ of an anisotropic body is related to the
deformation by

$$ \sigma_{ij} = c_{ijkl}\,\epsilon_{kl}, \qquad i, j, k, l = 1, 2, 3, \tag{2.18} $$

where $c_{ijkl}$ is a fourth-order tensor with 81 elastic constants (the Einstein
summation convention applies). Due to the symmetry conditions of the elastic tensor
$c_{ijkl} = c_{jikl} = c_{klij}$ and further thermodynamical arguments it is possible to
reduce this tensor to a 6 × 6 matrix with the index conventions 11 → 1, 22 → 2,
33 → 3, 12 → 6, 32 → 4, and 13 → 5 (also known as the Voigt notation). This
symmetric matrix is known as $c_{pq}$ and has in general 21 independent elements.


The number of independent elements of $c_{pq}$ is reduced if the medium belongs to
a certain symmetry system.

For example, the most common system used for seismic wave propagation
is hexagonal symmetry. In that case, assuming coordinate axes aligned with the
axes of symmetry, the matrix $c_{pq}$ reads

$$
c_{pq} =
\begin{pmatrix}
c_{11} & c_{12} & c_{13} & 0 & 0 & 0 \\
c_{12} & c_{11} & c_{13} & 0 & 0 & 0 \\
c_{13} & c_{13} & c_{33} & 0 & 0 & 0 \\
0 & 0 & 0 & c_{44} & 0 & 0 \\
0 & 0 & 0 & 0 & c_{44} & 0 \\
0 & 0 & 0 & 0 & 0 & c_{66}
\end{pmatrix}
\tag{2.19}
$$

with at most five independent elastic parameters, since $c_{66} = (c_{11} - c_{12})/2$.
For arbitrary orientation of the symmetry axis, this matrix needs to be rotated and
is full.

Fig. 2.10 Seismic attenuation. The quality factor Q is used to describe seismic
attenuation (see text). In this graph the amplitude decay due to attenuation only
is shown as a function of propagation distance for an initial unity amplitude. The
curves illustrate the behaviour for Q = 10 (lowest curve, strongest attenuation) to
Q = 100 (top curve, weakest attenuation) at a frequency of 1 Hz.

Fig. 2.11 Shear-wave splitting (birefringence). In homogeneous anisotropic
material in general there are two quasi-shear waves qS1 and qS2 propagating
with different velocities and orthogonal polarizations. This leads to the
well-known shear-wave splitting phenomenon. As the orientation of aligned
small-scale heterogeneities (e.g. cracks, filled pore space) is usually related
to the stress field, the observation of shear-wave polarizations may carry
important information about the rock mass at depth.
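
To make the layout of the hexagonal stiffness matrix in Eq. 2.19 concrete, the following Python sketch fills the 6 × 6 Voigt matrix from five input constants. The function name and the numerical values (in GPa) are hypothetical and chosen only for illustration; the relation $c_{66} = (c_{11} - c_{12})/2$ is the standard constraint for hexagonal symmetry with the symmetry axis along the third coordinate axis.

```python
import numpy as np

def hexagonal_stiffness(c11, c12, c13, c33, c44):
    """6x6 Voigt matrix for hexagonal symmetry, symmetry axis along x3."""
    c66 = 0.5 * (c11 - c12)          # not an independent parameter
    C = np.zeros((6, 6))
    C[0, 0] = C[1, 1] = c11
    C[0, 1] = C[1, 0] = c12
    C[0, 2] = C[2, 0] = C[1, 2] = C[2, 1] = c13
    C[2, 2] = c33
    C[3, 3] = C[4, 4] = c44
    C[5, 5] = c66
    return C

# Hypothetical elastic constants in GPa, for illustration only
C = hexagonal_stiffness(c11=60.0, c12=20.0, c13=18.0, c33=50.0, c44=15.0)
print(C)

# Isotropic special case: c11 = c33 = lambda + 2 mu, c12 = c13 = lambda, c44 = mu
C_iso = hexagonal_stiffness(c11=9.0, c12=3.0, c13=3.0, c33=9.0, c44=3.0)
```

In the isotropic special case the sketch returns $c_{66} = c_{44} = \mu$, which is a quick sanity check for the index layout.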

The consequences of material anisotropy for seismic wave propagation are
manifold. First, akin to birefringence known from crystal optics, shear waves
split into two so-called quasi-shear waves qS1 and qS2 that propagate with
different velocities and orthogonal polarizations. This is schematically illustrated
in Fig. 2.11. There are no longer pure compressional waves polarized in the
longitudinal direction. They are replaced by quasi-P (qP) waves. Furthermore,
wavefronts in homogeneous media are no longer of spherical shape. This is
illustrated in an example of a TI (transversely isotropic) medium corresponding
to a hexagonal symmetry system. This is a common anisotropic system that
results from fine horizontal layering. In this case the axis of symmetry is the
vertical axis, and quasi-shear waves are decoupled into pure qSV and qSH waves
polarized in vertical and horizontal direction, respectively. In TI media phase
velocities are azimuthally isotropic. However, in any vertical plane the phase
velocities vary with direction as indicated in Fig. 2.12. The figure was obtained
using the weak anisotropy approximation introduced by Thomsen (1986). He
introduced three dimensionless parameters $\epsilon$, $\gamma$, and $\delta$. The parameter
$\epsilon$ can be interpreted as the fraction of qP anisotropy, and $\gamma$ the corresponding
fraction for the qSH anisotropy. The phase velocity variations can be obtained by
the relations

$$
\begin{aligned}
v_{qP}(\theta) &= v_{P0}\left(1 + \delta \sin^2\theta \cos^2\theta + \epsilon \sin^4\theta\right) \\
v_{qSV}(\theta) &= v_{S0}\left(1 + \frac{v_{P0}^2}{v_{S0}^2}\,(\epsilon - \delta)\,\sin^2\theta \cos^2\theta\right) \\
v_{qSH}(\theta) &= v_{S0}\left(1 + \gamma \sin^2\theta\right),
\end{aligned}
\tag{2.20}
$$

where $\theta = 0$ corresponds to the vertical direction. The example in Fig. 2.12 was
obtained with $v_{P0}$ = 5 km/s, $v_{S0}$ = 2.89 km/s, $\epsilon$ = 0.1, $\delta$ = 0, and $\gamma$ = 0.1. These
values are probably beyond the limit of the approximation and are used here
merely to illustrate the principle. Note that the qS-phase velocity surfaces touch
each other, in which case the polarization remains undefined (as in the isotropic
case in any direction). These directions are called singularities in anisotropy
terminology.
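
The weak-anisotropy relations of Eq. 2.20 are straightforward to evaluate. The sketch below is a minimal Python version using the parameter values quoted in the text for Fig. 2.12; the function name is an assumption made here, and the angle $\theta$ is measured from the vertical symmetry axis.

```python
import numpy as np

def thomsen_phase_velocities(theta, vp0, vs0, eps, delta, gamma):
    """Weak-anisotropy phase velocities (qP, qSV, qSH) for a TI medium."""
    s2 = np.sin(theta) ** 2
    c2 = np.cos(theta) ** 2
    v_qp = vp0 * (1.0 + delta * s2 * c2 + eps * s2 ** 2)
    v_qsv = vs0 * (1.0 + (vp0 / vs0) ** 2 * (eps - delta) * s2 * c2)
    v_qsh = vs0 * (1.0 + gamma * s2)
    return v_qp, v_qsv, v_qsh

# Parameter values quoted in the text (velocities in km/s)
theta = np.radians([0.0, 45.0, 90.0])
v_qp, v_qsv, v_qsh = thomsen_phase_velocities(
    theta, vp0=5.0, vs0=2.89, eps=0.1, delta=0.0, gamma=0.1)
for th, a, b, c in zip(np.degrees(theta), v_qp, v_qsv, v_qsh):
    print(f"theta = {th:4.0f} deg: qP = {a:.3f}, qSV = {b:.3f}, qSH = {c:.3f} km/s")
```

At $\theta = 0$ the two quasi-shear velocities coincide, which corresponds to the touching phase velocity surfaces (singularity) mentioned above.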

There are many ways to initialize the elastic constants for anisotropic
material. A comprehensive review of anisotropic symmetry systems is given in
Crampin (1984). Anisotropic wave propagation simulated with numerical meth- 
ods is best verified against quasi-analytical solutions, such as programs based on 
the anisotropic reflectivity method (Booth and Crampin, 1983). 


2.3.3 Poroelasticity 


When the rock mass is strongly fractured, and pore space (partially) filled with 
liquids (water, oil) or gases (air, methane), the elastic equations are no longer 


sufficient. In this case the stress-strain relation is replaced by an alternative one 
that was developed using continuum mechanics. The most important effect 
is that in homogeneous poro-elastic media there are two types of compres- 
sional waves (a fast and a slow one) in addition to the classic shear wave (see 
Fig. 2.13). This result was controversial until such waves were first observed in the 
eighties. Obviously, poroelastic effects play an important role in reservoir prob- 
lems, geothermal projects, and volcanology. Some community wave-propagation 
solvers have optional poroelasticity modules. The most extensive description of 
poroelasticity can be found in Carcione (2014). 


2.4 Boundary and initial conditions 


Elastic wave propagation is governed by partial differential equations the solu- 
tion of which depends on the definition of initial and boundary conditions. In 
numerical schemes some of these conditions are trivially implemented (like initial
conditions), some extremely hard to fulfill (like absorbing boundaries). While
not a focus of this introductory text, aspects of boundary conditions are briefly 
discussed for each numerical method later on. Here, we summarize them from a 
physical point of view. 


2.4.1 Initial conditions 


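In most cases, everything is at rest when we start. As our solution fields are either
displacement u(x, t) or velocity v(x, t) this can be expressed as

$$
\begin{aligned}
u_i(\mathbf{x}, t = 0) &= 0 \\
v_i(\mathbf{x}, t = 0) &= 0.
\end{aligned}
\tag{2.21}
$$

We also assume that moments and forces are activated at time t > 0, thus

$$
\begin{aligned}
f_i(\mathbf{x}, t < 0) &= 0 \\
M_{ij}(\mathbf{x}, t < 0) &= 0.
\end{aligned}
\tag{2.22}
$$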


As indicated in the section with analytical solutions, waves can also be generated
by a space-dependent initial condition in displacements, velocities, or stresses,
depending on the specific wave equation. Such initial conditions may be described as

$$
\begin{aligned}
v_i(\mathbf{x}, t = 0) &= v_i^0(\mathbf{x}) \\
\sigma_{ij}(\mathbf{x}, t = 0) &= \sigma_{ij}^0(\mathbf{x}),
\end{aligned}
\tag{2.23}
$$

here with velocities or stresses, respectively. A general analytical solution with this
initial condition is developed for the velocity-stress formulation in the chapter on
the finite-volume method.


Fig. 2.12 Seismic anisotropy. Phase velocity variations of qP, qSV, and qSH
waves (km/s) in the x-z plane for a hexagonal (transversely isotropic) symmetry
system. The axis of symmetry is the vertical axis ($\theta = 0$). See text for details.

Fig. 2.13 Poroelastic wave propagation. Snapshot of wave propagation in
an anisotropic, poroelastic model of a sandstone material (horizontal motion
component). The most important effect of a poroelastic material is the presence of
a slow P-wave (centre of image). Figure from de la Puente et al. (2008).

8 Poroelasticity goes back primarily to the work of M. A. Biot (1905-1985) in
the fifties. During that time he worked as an independent consultant for the Shell
Company.

Fig. 2.14 Free-surface boundary condition. In the presence of surface topography,
the free-surface boundary condition involves the local normal direction $n_j$ to
the surface.

Fig. 2.15 Illustration of a geometrical set-up for Lamb’s problem. A vertical
point force $f_z$ is activated and recorded at some distance by a receiver recording
ground motion components $u_i$. [Figure labels: vertical force $f_z$, free surface,
elastic half-space with $v_p$, $v_s$, $\rho$.]

9 The distinguished British mathematician Sir Horace Lamb (1849-1934)
worked in Manchester, Cambridge, and Adelaide (Australia), mainly on problems
in fluid mechanics and elasticity.


Fig. 2.16 Analytical solution to Lamb’s 
problem (vertical displacement). The parameters of the half-space are $v_p$ =
8 km/s, $v_s$ = 4.62 km/s, and $\rho$ =
3.3 g/cm³. The vertical force is input
with 10’ dyn and the receiver distance 
is 10 km. Left: Solution to a step-like 
source time function. Right: Solution to 
a Gaussian-shaped source time function 
with dominant frequency of 5 Hz. 


2.4.2 Free surface and Lamb’s problem 


The question of which forces act in a specific direction $n_j$ given a space-dependent
stress field $\sigma_{ij}$ leads to the concept of tractions $t_i$ defined as

$$ t_i = \sigma_{ij}\, n_j, \tag{2.24} $$

where $n_j$ is normalized (i.e. |n| = 1). At the Earth’s free surface the tractions
acting on it are zero. This is called the free surface boundary condition,
a condition that leads to the existence of surface waves, a dominant feature in
regional and global broadband seismograms. Assuming the z-direction pointing
upwards, n = [0, 0, 1], this condition leads to

$$ \sigma_{xz} = \sigma_{yz} = \sigma_{zz} = 0. \tag{2.25} $$


As indicated in Fig. 2.14, this boundary condition depends on the local nor- 
mal direction in the case of complex surface topography. With some numerical 
methods this is hard to implement accurately. Thus the involvement of a rugged 
free surface becomes an important aspect when choosing the right solution 
strategy. 
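
The dependence of the free-surface condition on the local normal can be illustrated with a short Python sketch of Eq. 2.24. The stress values and the tilt angle are hypothetical and serve only as an example; the function name is an assumption made here.

```python
import numpy as np

def traction(sigma, n):
    """Traction t_i = sigma_ij n_j on a surface with (normalized) normal n."""
    n = np.asarray(n, dtype=float)
    n = n / np.linalg.norm(n)
    return sigma @ n

# Hypothetical stress state with sigma_xz = sigma_yz = sigma_zz = 0
sigma = np.array([[1.0, 0.2, 0.0],
                  [0.2, 2.0, 0.0],
                  [0.0, 0.0, 0.0]])

t_flat = traction(sigma, [0.0, 0.0, 1.0])          # flat surface, normal along z
a = np.radians(20.0)                               # assumed local slope
t_tilt = traction(sigma, [np.sin(a), 0.0, np.cos(a)])
print("flat surface  :", t_flat)
print("tilted surface:", t_tilt)
```

The same stress state that satisfies the condition for a flat surface produces a non-zero traction on the tilted surface, which is why topography requires the local normal to be taken into account.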

The so-called Lamb’s problem⁹ is an important benchmark for numerical
solvers that incorporate a free-surface boundary. An analytical solution for the
general problem of a point volumetric force acting inside or at the boundary
of an elastic half-space was presented by Johnson (1974). The geometrical set-up
is illustrated in Fig. 2.15 and an example is shown in Fig. 2.16. As will
be discussed in more detail below, the stress-free surface leads to the presence
of a Rayleigh-type surface wave. The seismograms show a faint P-wave arrival
from propagation along the surface, followed by a Rayleigh surface wave
with substantially larger amplitude. These results were obtained with the original
Fortran code by Lane Johnson, and some of the results were reproduced from
Johnson (1974). The analytical solution is part of the supplementary electronic
material.


2.4.3 Internal boundaries 


Inside the Earth we frequently have rapid changes of material parameters, often 
real material discontinuities (e.g. layers). Examples are stratifications in sedi- 
ments, the crust-mantle (Moho) discontinuity, the core-mantle boundary, or the 
inner-core boundary. Due to lithostatic pressure such discontinuities can in most 
cases be treated as perfectly welded interfaces. In this case, displacement and trac- 
tions are continuous. At the interface between two media 1 and 2 this condition 
can be expressed as 
$$
\begin{aligned}
\sigma_{ij}^{(1)}\, n_j &= \sigma_{ij}^{(2)}\, n_j \\
u_i^{(1)} &= u_i^{(2)}.
\end{aligned}
\tag{2.26}
$$


This is further illustrated in Fig. 2.17 with a curved interface. The good news is 
that this condition does not have to be explicitly implemented in all the numerical 
solution strategies we discuss. However, an important question that is still under 
debate is whether the actual location of such interfaces has to be honoured by a 
computational mesh, which might be tricky for complex structures (e.g. a sed- 
imentary basin). On the other hand, recent work in the field of homogenization 
suggests that an efficient way of dealing with this problem might be to replace the 
discontinuity by a smooth structure (see Chapter 11). 

There is one situation that might have to be treated explicitly and this is a fluid— 
solid boundary. Indeed, this is the approach taken by the specfem3d developers for 
global wave propagation to properly simulate the wave effects of the core—mantle 
boundary (e.g. Komatitsch et al., 20000). 


2.4.4 Absorbing boundary conditions 


Most seismological applications (global wave propagation is an exception) re- 
quire the simulation of wave fields in limited areas (e.g. reservoirs, continents, 
fault zones, volcanoes). This implies that at some point seismic waves emanat- 
ing from a source will hit the boundary of a computational domain. Modelling 
reality would imply that waves (except at the free surface) are passing undis- 
turbed through that boundary which can then be considered as absorbing. It 
turns out that this is a really tough problem for most numerical methods and 
is still the topic of ongoing research. Most approaches only solve part of the 
problem, absorb only in a certain frequency band or propagation direction, 
and some lead to additional computational instabilities. The art of efficient 
absorbing boundaries is a field of its own and therefore not covered in this
introductory text. Suggestions for further reading are given at the end of this
chapter.

Fig. 2.17 Material interface. Across an interface with changing geophysical
parameters displacement components $u_i$ and traction $t_i = \sigma_{ij} n_j$ are
continuous. This is called a welded material interface.

Table 2.2 Examples of crustal seismic velocities and elastic parameters.

Parameter    Value
$v_p$        6,000 m/s
$v_s$        3,464 m/s
$\rho$       2,500 kg/m³
$\lambda$    3 × 10^10 Pa
$\mu$        3 × 10^10 Pa

Fig. 2.18 Plane waves. Illustration of planes of constant phase (e.g. pressure,
displacement) travelling with constant speed in the direction of wavenumber
vector k.


2.5 Fundamental solutions 


What are the basic consequences of the equations of elastic motion? 


2.5.1 Body waves 


First of all, in the absence of any sources it turns out that there are two body- 
wave types that propagate in infinite homogeneous elastic media. These are 
compressional P-waves and transversely polarized shear (S-) waves with velocities 


$$ v_p = \sqrt{\frac{\lambda + 2\mu}{\rho}}, \qquad v_s = \sqrt{\frac{\mu}{\rho}}. \tag{2.27} $$


When initializing seismic simulations, often the Lamé parameters need to be
given. They are calculated from seismic velocities as

$$ \lambda = \rho\,(v_p^2 - 2 v_s^2), \qquad \mu = \rho\, v_s^2. \tag{2.28} $$
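
As a quick check of Eq. 2.28, the following Python sketch converts the crustal velocities and density of Table 2.2 into Lamé parameters. The function name is an assumption made here for illustration.

```python
def lame_parameters(vp, vs, rho):
    """Lamé parameters (Pa) from velocities (m/s) and density (kg/m^3)."""
    mu = rho * vs ** 2
    lam = rho * (vp ** 2 - 2.0 * vs ** 2)
    return lam, mu

# Values from Table 2.2
lam, mu = lame_parameters(vp=6000.0, vs=3464.0, rho=2500.0)
print(f"lambda = {lam:.3e} Pa, mu = {mu:.3e} Pa")

# Round trip back to the P-wave velocity (Eq. 2.27)
vp_check = ((lam + 2.0 * mu) / 2500.0) ** 0.5
```

Both parameters come out close to 3 × 10^10 Pa, consistent with the table.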


Examples for values in the Earth’s crust are given in Table 2.2. A useful assumption
for crustal velocities is that $v_p = \sqrt{3}\, v_s$, in which case $\lambda = \mu$. An important
concept that we will extensively use to understand numerical approximations to
the elastic wave equations is the mathematical description of plane waves. The
simplest form for a scalar (e.g. acoustic) plane sinusoidal wave in 3D is


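$$ p(\mathbf{x}, t) = p_0 \sin(\mathbf{k}\cdot\mathbf{x} - \omega t) = p_0 \sin(k_x x + k_y y + k_z z - \omega t), \tag{2.29} $$

where p(x, t) is the pressure at position x, $p_0$ is the maximum amplitude, k is the
wavenumber vector pointing in the direction of propagation, and $\omega$ is the angular
frequency (see Fig. 2.18). The propagation velocity c is given by

$$ c = \frac{\omega}{|\mathbf{k}|} = \frac{\lambda}{T}, \tag{2.30} $$

where |k| is the modulus of the wavenumber vector, $\lambda = 2\pi/|\mathbf{k}|$ is the wavelength
(not to be confused with the Lamé parameter), and $T = 2\pi/\omega$ is the period.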
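
The relations between wavenumber, wavelength, period, and velocity in Eqs. 2.29 and 2.30 can be verified numerically. In this Python sketch the velocity, frequency, and propagation direction are hypothetical values chosen for illustration; the check at the end follows a point of constant phase for a short time.

```python
import numpy as np

c = 3000.0                                   # assumed propagation velocity (m/s)
f = 2.0                                      # assumed frequency (Hz)
direction = np.array([1.0, 1.0, 0.0])
direction /= np.linalg.norm(direction)       # unit vector along propagation

omega = 2.0 * np.pi * f                      # angular frequency (rad/s)
k = omega / c * direction                    # wavenumber vector (rad/m)
wavelength = 2.0 * np.pi / np.linalg.norm(k)
period = 2.0 * np.pi / omega

def pressure(x, t, p0=1.0):
    """Scalar plane wave p = p0 * sin(k . x - omega * t)."""
    return p0 * np.sin(np.dot(k, x) - omega * t)

x0 = np.array([100.0, 50.0, 0.0])
dt = 0.1
x1 = x0 + c * dt * direction                 # move with the wave for a time dt
print(wavelength, period, pressure(x0, 0.0), pressure(x1, dt))
```

An observer moving with velocity c along k sees a constant pressure value, which is what "planes of constant phase" means in practice.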

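For elastic waves with three motion components, using the common complex
exponential form, plane waves can be written as

$$ \mathbf{u}(\mathbf{x}, t) = \mathbf{A}\, e^{i(\mathbf{k}\cdot\mathbf{x} - \omega t)}, \tag{2.31} $$

where A is the polarization vector, with $\mathbf{A} \parallel \mathbf{k}$ for P-waves and
$\mathbf{A} \perp \mathbf{k}$ for S-waves.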


2.5.2 Gradient, divergence, curl 


At this point it is useful to muse about the vector field operators gradient $\nabla$,
divergence $\nabla\cdot$, and curl $\nabla\times$, and their connections to (simulated) elastic
wavefields. The gradient of an elastic wave field is defined as

$$ \nabla \mathbf{u}(\mathbf{x}, t) = \partial_i u_j(\mathbf{x}, t). \tag{2.32} $$

The separation of this tensor into symmetric and anti-symmetric parts leads to
the definition of the symmetric deformation (strain) tensor $\epsilon_{ij}$,

$$ \epsilon_{ij}(\mathbf{x}, t) = \frac{1}{2}\left(\partial_i u_j(\mathbf{x}, t) + \partial_j u_i(\mathbf{x}, t)\right) \tag{2.33} $$

that is an integral part of the elastic wave equation. Note that today it is pos- 
sible to observe strain components (Fig. 2.19), and/or derive the strain of a 
seismic wavefield from seismic array measurements. Strain is a linear combina- 
tion of space-derivatives of the seismic wavefield. Therefore it can be derived by 
finite-differencing array observations of appropriate dimensions. 
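
The idea of obtaining strain by finite-differencing displacement observations can be sketched as follows. This is a synthetic Python illustration under stated assumptions: the displacement field is a hypothetical linear field with a known gradient, the "stations" are placed symmetrically at distance h around the point of interest, and all names are chosen here for the example.

```python
import numpy as np

def displacement_gradient(u, x, h=1.0):
    """Central-difference estimate of G[i, j] = d u_j / d x_i at point x."""
    G = np.zeros((3, 3))
    for i in range(3):
        dx = np.zeros(3)
        dx[i] = h
        G[i, :] = (u(x + dx) - u(x - dx)) / (2.0 * h)
    return G

def strain(G):
    """Symmetric part of the displacement gradient (Eq. 2.33)."""
    return 0.5 * (G + G.T)

# Hypothetical linear displacement field u_j = x_i * G_true[i, j]
G_true = np.array([[1e-6, 2e-6, 0.0],
                   [0.0, -1e-6, 3e-6],
                   [4e-6, 0.0, 2e-6]])

def u(x):
    return x @ G_true

G = displacement_gradient(u, np.zeros(3), h=10.0)
eps = strain(G)
print(eps)
```

For real array data the station spacing has to be small compared with the wavelength for the finite-difference estimate to be meaningful.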

In isotropic media, elastic wavefields can be separated into a divergence-free
part (S-waves) applying the curl to the wavefield

$$
\nabla \times \mathbf{u} =
\begin{pmatrix}
\partial_y u_z - \partial_z u_y \\
\partial_z u_x - \partial_x u_z \\
\partial_x u_y - \partial_y u_x
\end{pmatrix}
\tag{2.34}
$$

(implicit space-time dependence) and a curl-free part (P-waves) applying the
divergence operator

$$ \nabla \cdot \mathbf{u} = \partial_x u_x + \partial_y u_y + \partial_z u_z = \epsilon_{ii}, \tag{2.35} $$

where $\epsilon_{ii}$ is the trace of the strain tensor corresponding to volumetric change (the
Einstein summation rule applies). When simulating seismic wavefields, usually
the partial derivatives of the displacement fields are calculated anyway. Therefore,
it is easy to output curl or divergence fields. For example, they can be used to
separate P- and S-wave fields (e.g. for snapshots with different colouring of P
and S).
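
The separation of P- and S-wave fields by divergence and curl can be tried out on synthetic plane waves. The Python sketch below works in the 2D x-z plane, where the curl has only a y-component; grid size, wavelength, and propagation angle are assumptions chosen for illustration, and derivatives are approximated with `numpy.gradient`.

```python
import numpy as np

n = 201
x = np.linspace(0.0, 1000.0, n)
z = np.linspace(0.0, 1000.0, n)
X, Z = np.meshgrid(x, z, indexing="ij")

# Wavenumber vector for an assumed wavelength of 400 m
k = 2.0 * np.pi / 400.0 * np.array([np.cos(0.3), np.sin(0.3)])
khat = k / np.linalg.norm(k)
phase = np.sin(k[0] * X + k[1] * Z)

def div_and_curl(ux, uz):
    """Divergence and y-component of the curl in the x-z plane."""
    dux_dx, dux_dz = np.gradient(ux, x, z)
    duz_dx, duz_dz = np.gradient(uz, x, z)
    return dux_dx + duz_dz, dux_dz - duz_dx

# P-wave: polarization parallel to k
div_p, curl_p = div_and_curl(khat[0] * phase, khat[1] * phase)
# S-wave: polarization perpendicular to k
div_s, curl_s = div_and_curl(-khat[1] * phase, khat[0] * phase)

print("P: max|div| =", np.abs(div_p).max(), " max|curl| =", np.abs(curl_p[1:-1, 1:-1]).max())
print("S: max|div| =", np.abs(div_s[1:-1, 1:-1]).max(), " max|curl| =", np.abs(curl_s).max())
```

The curl of the P-wave and the divergence of the S-wave are not exactly zero in this sketch. The small residual comes from the finite-difference approximation of the derivatives.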

The combined analysis of displacement field, and its strain components, diver- 
gence, and curl offers interesting opportunities for the seismic inverse problem for 
sources and structures, provided that they can be observed with sufficient accu- 
racy. See the exercises at the end of the chapter for some theoretical examples. 
Rotational ground motions have been observed relatively recently using ring laser 
systems (Fig. 2.20, see Igel et al., 2005). 


2.5.3 Surface waves 


A free surface with stress-free boundary conditions (see above) on top of a medium
with homogeneous properties defines an elastic half-space. Covering the theoretical
background goes beyond the quick introduction this text aims for (see literature at
the end of this chapter). In the following we review some of the main properties of
surface waves that you are likely to encounter when simulating waves in 3D media
with receivers at or near the surface.

Fig. 2.19 Strainmeter at Piñon Flat Observatory, California. One (of three)
leg(s) is shown. A light beam is sent through a tube 700 m in length, reflected
at the end, and recombined with a reference beam to measure the change of
length Δl. Δl/l corresponds to the strain in this direction.

Fig. 2.20 Ring laser at Piñon Flat Observatory, California. This instrument
measures the rotation (rate) around the vertical axis (vertical component of
$\nabla \times \mathbf{v}$, v being the ground velocity) by superposition of two
counter-propagating mono-frequent laser beams (Sagnac effect).

Fig. 2.21 Surface waves. Illustration of the particle motion of seismic surface
waves. Top: Love waves correspond to SH-waves (horizontally polarized shear
waves). Their particle motion is perpendicular to the direction of propagation in
the horizontal plane. Bottom: Rayleigh waves are polarized in the plane through
source and receiver (corresponding to P-SV motion). Their particle motion is
in general elliptical, and retrograde at the Earth’s surface. The amplitude of
both wave types decays with depth. From Shearer (2009).

Fig. 2.22 Surface wave dispersion. In layered media, Love and Rayleigh
waves are dispersive. Their velocities depend on frequency. Phase and group
velocities are shown as a function of frequency for a typical layered Earth
model. Figure courtesy of Gabi Laske.


The presence of the free surface in a homogeneous elastic half-space (assuming
incident plane P or SV waves) leads to one additional wave type:
Rayleigh waves. They propagate in a horizontal direction, are polarized in the
plane through source and receiver, and travel in a homogeneous half-space
without dispersion at slightly less than the shear-wave speed. Their amplitude decays
exponentially with depth. As discussed above, Rayleigh waves in an elastic half-space
can be generated by an impulsive vertical force acting on the free surface,
providing an excellent reference to check the correct implementation of the free
surface boundary condition.

Interestingly, no such waves exist in an elastic half-space for horizontal motion
components! However, if we assume a low-velocity layer $v_1 < v_2$ of depth h below
the surface, with velocity $v_2$ in the half-space below, we obtain further solutions:
so-called Love waves with a fundamental mode of wave propagation with velocity
in between $v_1$ and $v_2$. The polarization properties of these surface wave types are
illustrated in Fig. 2.21. Love waves are always dispersive.

In layered Earth models Rayleigh waves also show dispersive behaviour. As 
the Earth’s interior (on a global scale)—to first order—can be treated as a layered 
medium, it is useful to take a look at the velocities expected as a function of fre- 
quency. This is illustrated in Fig. 2.22. It is instructive to examine the arrival-time 
difference between long-period and short-period surface waves at long propa- 
gation distances (see exercises). The dispersion curves for Love- and Rayleigh 
waves illustrate the strong dependence of surface wave velocities as a function 
of period (or frequency). It is precisely this feature that dominates teleseismic 
broadband observations as indicated in the initial discussion of observations in 
Fig. 2.1. As we encounter the phenomenon of physical dispersion in the context
of surface waves, I would like to introduce the concept of numerical dispersion,
which we will dwell on in more detail later. Numerical dispersion is an unwanted
effect of discretizing a wave field and needs to be avoided at all costs. Physical and
numerical dispersion are illustrated qualitatively in Fig. 2.23. The graph shows a
classic example of Love-wave dispersion (simulated using the finite-difference
method), obtained for a surface source in a low-velocity zone. Exact parameters
are irrelevant for this demonstration. We note that low frequencies arrive earlier.

Exactly the same source receiver set-up for a higher-frequency source wavelet 
and an elastic half-space leads to numerical dispersion effects. Here also higher 
frequencies arrive later. This dispersive effect is entirely artificial and a conse- 
quence of a bad choice of simulation parameters. Note that sometimes these 
effects are hard to distinguish and one of the main goals of this volume is to 
provide guidelines on how to avoid these errors. 
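
A first sanity check against numerical dispersion is to count the number of grid points per shortest wavelength. The Python sketch below does this and, as an illustration, also evaluates the well-known dispersion relation of the standard second-order 1D finite-difference scheme for the acoustic wave equation, $\sin(\omega\,dt/2) = \epsilon \sin(k\,dx/2)$ with $\epsilon = c\,dt/dx$. Choosing this particular scheme is an assumption made here; the simulations behind Fig. 2.23 may use a different one, and all function names and numbers are hypothetical.

```python
import numpy as np

def points_per_wavelength(v, f, dx):
    """Number of grid points per wavelength for velocity v, frequency f, spacing dx."""
    return v / (f * dx)

def numerical_phase_velocity_ratio(n_ppw, eps):
    """c_num / c for the second-order 1D scheme, eps = c * dt / dx (eps <= 1)."""
    kdx = 2.0 * np.pi / n_ppw
    return 2.0 / (eps * kdx) * np.arcsin(eps * np.sin(kdx / 2.0))

print("points per wavelength:", points_per_wavelength(v=3000.0, f=2.0, dx=100.0))
for n_ppw in (5, 10, 20, 40):
    ratio = numerical_phase_velocity_ratio(n_ppw, eps=0.5)
    print(f"{n_ppw:3d} points per wavelength: c_num / c = {ratio:.4f}")
```

For this scheme poorly sampled (high-frequency) waves travel too slowly and therefore arrive late, in line with the artificial dispersion described above. How many points per wavelength are sufficient depends on the method and the propagation distance.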


2.6 Seismic sources 


In addition to the structural parameters of the Earth model, the physical descrip- 
tion of the seismic source parameters will affect the resulting wavefield. In the 
following, we briefly review the most important concepts relevant for simulation 
tasks. 


2.6.1 Forces and moments 


As indicated in the elastic wave equations given above, seismic sources can be 
injected (1) as stress perturbations using the moment tensor $M_{ij}(\mathbf{x}, t)$ or (2) as
external forces $f_i(\mathbf{x}, t)$. The latter source type involves energy provided through
some external processes (e.g. a hammer hitting the ground, or pressure sources 
induced by water waves). For seismology and seismic exploration the most im- 
portant source types can be described using the moment tensor and this will be 
the focus of this section. 

The second-order symmetric moment tensor has units of stress [Pa = N/m²]
with elements

$$
\mathbf{M} =
\begin{pmatrix}
M_{11} & M_{12} & M_{13} \\
M_{21} & M_{22} & M_{23} \\
M_{31} & M_{32} & M_{33}
\end{pmatrix}
\tag{2.36}
$$

each of which describes a double-couple force system as illustrated in Fig. 2.24.
Before we give some examples, let us ask what the values of the moment tensor
components mean.

To answer this question we need the concept of the scalar seismic moment $M_0$
that is defined as

$$ M_0 = \mu A d, \tag{2.37} $$

where $\mu$ is the shear modulus in the source area, A is the surface area of the
rupturing fault plane, and d is the average slip on the fault. This simple equation
is one of the most important results in earthquake seismology, providing the
link between radiating wavefield, the size of an earthquake rupture, and (if the
earthquake is large enough) geodetically observable displacements at the Earth’s
surface as a result of the static slip across the fault.

Fig. 2.23 Physical vs. numerical dispersion. Dispersion is the frequency
dependence of propagation velocities. Top: Physical dispersion of Love waves for a
half-space with a low-velocity zone at the top. Bottom: Numerical dispersion
for the same source-receiver set-up due to the discretization of the wavefield with
an insufficient number of grid points per wavelength.

Fig. 2.24 Seismic moment tensor. Illustration of the double-couple forces
corresponding to the elements of the seismic moment tensor. From Shearer (2009).

Fig. 2.25 Far-field radiation pattern of a shear-dislocation source. The figure
illustrates the strongly varying radiation of P-waves (dotted) and shear waves
(solid) for a double-couple point source (indicated at the centre). Note the
approximately five times larger peak amplitude of S-waves compared to P-waves.

The scalar moment can further be obtained through the following relation:

$$ M_0 = \frac{1}{\sqrt{2}} \left( \sum_{ij} M_{ij}^2 \right)^{1/2}. \tag{2.38} $$

The seismic moment determines the energy radiated from the seismic source. 
It scales the components of the normalized moment tensor that is responsible 
for the radiation pattern of P- and S-waves. For example, the case of non-zero 
components M12 = M2; corresponds to a double-couple source acting across the 
plane x = 0 in y-direction or across the plane y = 0 in x-direction. This is the well- 
known ambiguity between fault and auxiliary plane. The resulting double-couple 
(dc) moment tensor is 


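$$
\mathbf{M}^{dc} =
\begin{pmatrix}
0 & M_0 & 0 \\
M_0 & 0 & 0 \\
0 & 0 & 0
\end{pmatrix}
= M_0
\begin{pmatrix}
0 & 1 & 0 \\
1 & 0 & 0 \\
0 & 0 & 0
\end{pmatrix}.
\tag{2.39}
$$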
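
The link between Eqs. 2.37, 2.38, and 2.39 can be verified with a small Python sketch. The fault dimensions and slip are hypothetical values chosen for illustration, the shear modulus is the crustal value of Table 2.2, and the function names are assumptions made here.

```python
import numpy as np

def scalar_moment_from_fault(mu, area, slip):
    """Scalar seismic moment M0 = mu * A * d (N m)."""
    return mu * area * slip

def scalar_moment_from_tensor(M):
    """Scalar moment from a moment tensor, M0 = sqrt(sum M_ij^2) / sqrt(2)."""
    return np.sqrt(np.sum(M ** 2)) / np.sqrt(2.0)

mu = 3.0e10                       # shear modulus (Pa), see Table 2.2
area = 10.0e3 * 5.0e3             # hypothetical fault area: 10 km x 5 km (m^2)
slip = 1.0                        # hypothetical average slip (m)

M0 = scalar_moment_from_fault(mu, area, slip)

# Double-couple tensor with M12 = M21 = M0 (Eq. 2.39)
M_dc = M0 * np.array([[0.0, 1.0, 0.0],
                      [1.0, 0.0, 0.0],
                      [0.0, 0.0, 0.0]])
print(f"M0 = {M0:.2e} N m, recovered = {scalar_moment_from_tensor(M_dc):.2e} N m")
```

The scalar moment recovered from the double-couple tensor equals the value obtained from the fault parameters.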


Another important moment source is an explosion. The moment tensor
corresponds to equal forcing along the coordinate axes and has the diagonal form

$$
\mathbf{M}^{exp} = M_0
\begin{pmatrix}
1 & 0 & 0 \\
0 & 1 & 0 \\
0 & 0 & 1
\end{pmatrix},
\tag{2.40}
$$

and the moment $M_0$ has to be appropriately scaled. An important property of the
elastic wave equation is the fact that it is linear with respect to the source
properties. That means twice the scalar moment leads to twice the observed amplitude
(in displacement or velocity). Obviously, a physical source like the double-couple
shown above leads to a complicated radiation pattern of seismic body waves. This
anisotropic radiation is illustrated for a double-couple point source in Fig. 2.25.
Its mathematical form is presented in the next section.

As a first specific example of a simulation in homogeneous media we show 
snapshots of a displacement wavefield emanating from an explosive point source. 
The results are shown in Fig. 2.26. When starting with a community simulation 
code or developing your own code from scratch one should always check such 
simple model set-ups. 

In this example—as expected—the radiation pattern is isotropic when the di- 
vergence of the wavefield is shown. However, note that it does not appear to be 
isotropic when the individual displacement components are visualized. The ori- 
entation of the induced motion also leads to polarity reversals depending on the 
quadrant of the wave field. 




Because we will later compare with a double-couple source, let us also show
the curl of the wavefield, which is exactly zero in the case of an explosion source
that only generates P-waves. This might be trivial, but in fact there might be
numerical artefacts that lead the curl to be non-zero. Therefore, it is useful to
check this property.

To fully understand the effects of a seismic source described by the classic 
moment tensor we must have a look at the analytical solution to the double-couple 
point source in infinite media. This will be the topic of the next section. 


2.6.2 Seismic wavefield of a double-couple point 
source 


Much of seismology, e.g. the understanding of the kinematic properties of seismic
sources, the linking of earthquakes to measurable crustal deformation,
quasi-analytical solutions to wave-propagation problems, etc., is based on a fundamental
analytical solution to the problem of a double-couple point source in infinite
homogeneous media. It is an animal of an equation and quite hard to derive. However,
if you start a complete 3D elastic wave simulation your solution contains all
the elements of this fundamental solution. Therefore, bearing with me for some
time on this will prove worthwhile. The focus is to understand the result, not to
present the full derivation. One could build a whole lecture around this equation!
The interested reader is referred to Aki and Richards (2002).

Fig. 2.26 Seismic point sources. Elastic wave field for a point explosion source
at the centre of the homogeneous model. Top left: Divergence. Top right: Curl.
Bottom left and bottom right: the two displacement components.
Note that the curl vanishes for an explosion source in homogeneous isotropic
media. Snapshots are calculated with a 2D finite-difference scheme. The central
ball illustrates the isotropic radiation pattern, indicating equal extension (or
compression) in each direction.

Fig. 2.27 Spherical coordinate system. Coordinate system used for the analytical
solution of a double-couple point source. From Shearer (2009).

10 Dots above symbols denote time derivative.

What is the displacement wavefield $\mathbf{u}(\mathbf{x}, t)$ at some distance
$\mathbf{x}$ from a seismic moment tensor source with
$M_{xz} = M_{zx} = M_0$? This question leads to the following solution, the
main aspects of which we will briefly discuss. The displacement
$\mathbf{u}(\mathbf{x}, t)$ due to a double-couple point source in an infinite,
homogeneous, isotropic medium is (e.g. Aki and Richards, 2002)¹⁰


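$$
\begin{aligned}
\mathbf{u}(\mathbf{x}, t) ={}& \frac{1}{4\pi\rho}\,\mathbf{A}^{N}\,\frac{1}{r^{4}}
  \int_{r/\alpha}^{r/\beta} \tau\, M_0(t-\tau)\, d\tau \\
&+ \frac{1}{4\pi\rho\alpha^{2}}\,\mathbf{A}^{IP}\,\frac{1}{r^{2}}\,
  M_0\!\left(t-\frac{r}{\alpha}\right)
 + \frac{1}{4\pi\rho\beta^{2}}\,\mathbf{A}^{IS}\,\frac{1}{r^{2}}\,
  M_0\!\left(t-\frac{r}{\beta}\right) \\
&+ \frac{1}{4\pi\rho\alpha^{3}}\,\mathbf{A}^{FP}\,\frac{1}{r}\,
  \dot{M}_0\!\left(t-\frac{r}{\alpha}\right)
 + \frac{1}{4\pi\rho\beta^{3}}\,\mathbf{A}^{FS}\,\frac{1}{r}\,
  \dot{M}_0\!\left(t-\frac{r}{\beta}\right),
\end{aligned}
\tag{2.41}
$$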


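with the radiation patterns $\mathbf{A}^{N}$ (near-field), $\mathbf{A}^{IP}$
(intermediate-field P-wave), $\mathbf{A}^{IS}$ (intermediate-field S-wave),
$\mathbf{A}^{FP}$ (far-field P-wave) and $\mathbf{A}^{FS}$ (far-field S-wave):

$$
\begin{aligned}
\mathbf{A}^{N} &= 9\sin(2\theta)\cos(\phi)\,\hat{\mathbf{r}}
  - 6\left(\cos(2\theta)\cos(\phi)\,\hat{\boldsymbol{\theta}}
  - \cos(\theta)\sin(\phi)\,\hat{\boldsymbol{\phi}}\right) \\
\mathbf{A}^{IP} &= 4\sin(2\theta)\cos(\phi)\,\hat{\mathbf{r}}
  - 2\left(\cos(2\theta)\cos(\phi)\,\hat{\boldsymbol{\theta}}
  - \cos(\theta)\sin(\phi)\,\hat{\boldsymbol{\phi}}\right) \\
\mathbf{A}^{IS} &= -3\sin(2\theta)\cos(\phi)\,\hat{\mathbf{r}}
  + 3\left(\cos(2\theta)\cos(\phi)\,\hat{\boldsymbol{\theta}}
  - \cos(\theta)\sin(\phi)\,\hat{\boldsymbol{\phi}}\right) \\
\mathbf{A}^{FP} &= \sin(2\theta)\cos(\phi)\,\hat{\mathbf{r}} \\
\mathbf{A}^{FS} &= \cos(2\theta)\cos(\phi)\,\hat{\boldsymbol{\theta}}
  - \cos(\theta)\sin(\phi)\,\hat{\boldsymbol{\phi}},
\end{aligned}
\tag{2.42}
$$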


with the coordinate system shown in Fig. 2.27. Furthermore, in these equations
$r$ is the distance from the source, $\rho$, $\alpha$, $\beta$ are density,
P-velocity, and S-velocity, respectively, and $M_0(t)$ is the source time
function. We assume $\int \dot{M}_0(t)\,dt = M_0$, where $M_0$ is the scalar
moment. Note that this implies that an arbitrary source time function $s(t)$
can be initialized with

$$
M_0 \int s(t)\,dt = M_0, \qquad \int s(t)\,dt = 1.
\tag{2.43}
$$

The far-field terms of the above radiation patterns were used for the
illustrations in Fig. 2.25.

Let us have a closer look at the various terms in Eq. 2.41. First, we note that
the simplest realistic source time function for an earthquake source is a
stress drop occurring over a finite amount of time, the so-called rise time
$T_r$ (see Fig. 2.28).


We start with the situation relevant for regional or global seismology, or if
the ratio between propagated wavelengths and fault size is large. This is
called the far-field (last line in Eq. 2.41).

In this case, we observe that the displacements for P- and S-waves have the 
same waveform, the time derivative of the moment source time function. That 
means that the ground is displaced with the corresponding direction of motion, 
but goes back to its original position. Another important result is that the far- 
field amplitude terms decay with 1/r. This amplitude decay—the geometrical 
spreading—is the basis for the correction terms in the Richter magnitude scale. 
Furthermore, the amplitude factors reveal that shear wave displacements are ap- 
proximately five times larger than P-wave displacements (highly relevant for the 
damage to buildings during earthquakes). 
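
As a quick check of the factor quoted above, here is a small worked estimate. It assumes that the far-field radiation patterns of P- and S-waves reach comparable maximum amplitudes, and it uses the velocities of the model earthquake in Table 2.3 (for which $\alpha/\beta \approx \sqrt{3}$). Under these assumptions the ratio of the far-field S and P amplitude factors in Eq. 2.41 is

$$
\frac{1/(4\pi\rho\beta^{3})}{1/(4\pi\rho\alpha^{3})}
= \left(\frac{\alpha}{\beta}\right)^{3}
= \left(\frac{5196}{3000}\right)^{3} \approx 5.2 .
$$

The exact value depends on the velocity ratio of the medium, so "approximately five" should be read as typical for crustal rocks rather than universal.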

The important link between seismology and geodesy is made with the near and 
intermediate terms (top two lines of Eq. 2.41) that contain terms proportional to 
the moment source time function. This implies that there is a static displacement 
that remains when the source has finished acting, and the seismic wave field has 
passed through. These are the terms that—for large enough earthquakes—lead to 
the (sometimes substantial) crustal deformation that can be observed with GPS 
sensors (or, if possible, by integrating velocity seismograms). 

The factors A* are merely direction cosines that determine the radiation pat- 
tern of the various terms with maximum value 1. It is instructive to plot these 
terms to understand this directional behaviour (see exercises). 
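
A minimal Python sketch for evaluating the far-field radiation patterns on a grid of take-off angles is given below. It assumes the far-field terms $\mathbf{A}^{FP}$ and $\mathbf{A}^{FS}$ in the form given by Aki and Richards (2002); the function name `far_field_patterns` and the component ordering $(r, \theta, \phi)$ are choices made for this illustration only.

```python
import numpy as np

def far_field_patterns(theta, phi):
    """Far-field P and S radiation patterns of a double couple.

    Returns two arrays whose last axis holds the (r, theta, phi)
    components (ordering is an assumption of this sketch).
    """
    zero = np.zeros_like(theta)
    a_fp = np.stack([np.sin(2 * theta) * np.cos(phi), zero, zero], axis=-1)
    a_fs = np.stack([zero,
                     np.cos(2 * theta) * np.cos(phi),
                     -np.cos(theta) * np.sin(phi)], axis=-1)
    return a_fp, a_fs

# Regular grid of angles covering the focal sphere
theta, phi = np.meshgrid(np.linspace(0.0, np.pi, 181),
                         np.linspace(0.0, 2.0 * np.pi, 361),
                         indexing="ij")
a_fp, a_fs = far_field_patterns(theta, phi)

# Amplitude of each pattern in every direction
amp_p = np.linalg.norm(a_fp, axis=-1)
amp_s = np.linalg.norm(a_fs, axis=-1)

print("max |A^FP| =", amp_p.max())
print("max |A^FS| =", amp_s.max())
```

The arrays `amp_p` and `amp_s` can be passed to any plotting routine (for example a polar plot at fixed $\phi$) to visualize nodal planes and polarity changes.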

Let us give an example of how to connect earthquake properties with this
equation and how it can be realized in a simulation task. The parameters for
our earthquake simulation are given in Table 2.3. To obtain the scalar moment
$M_0$ for an earthquake with moment magnitude $M_w = 5$ we use the empirical
relationship

$$
M_w = \frac{2}{3}\left(\log_{10} M_0 - 9.1\right).
\tag{2.44}
$$

Assuming a typical earthquake stress drop of $\Delta\sigma = 5$ MPa we can
determine the fault radius $r$ for a circular rupture through

$$
\Delta\sigma = \frac{7 M_0}{16\, r^{3}},
\tag{2.45}
$$

by which the fault surface simply becomes $A_f = \pi r^2$. With the shear
modulus $\mu$ obtained from the information in Table 2.3 we can determine a
static displacement $d$ from the relation $M_0 = \mu A_f d$. In this example we
obtain a slip $d \approx 25$ cm, and with the assumption of a slip velocity of
1 m/s we obtain a rise time of $T_r = 0.25$ s. With this information we can
initialize a source time function $M_0(t)$.
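
The chain of estimates above can be reproduced with a few lines of Python. This is a sketch under stated assumptions: Eq. 2.45 is taken to be the standard circular-crack relation $\Delta\sigma = 7M_0/(16r^3)$, the shear modulus is computed as $\mu = \rho v_s^2$ from the values in Table 2.3, and the function names are hypothetical.

```python
import math

def scalar_moment(mw):
    """Invert Eq. 2.44 for the scalar moment M0 (in N m)."""
    return 10.0 ** (1.5 * mw + 9.1)

def fault_radius(m0, stress_drop):
    """Radius of a circular rupture from stress_drop = 7 M0 / (16 r^3)."""
    return (7.0 * m0 / (16.0 * stress_drop)) ** (1.0 / 3.0)

# Model earthquake parameters (Table 2.3 and text)
mw = 5.0
stress_drop = 5.0e6      # Pa
rho = 2500.0             # kg/m^3
vs = 3000.0              # m/s
slip_velocity = 1.0      # m/s (assumed in the text)

m0 = scalar_moment(mw)
radius = fault_radius(m0, stress_drop)
area = math.pi * radius ** 2
mu = rho * vs ** 2                   # shear modulus in Pa
slip = m0 / (mu * area)              # from M0 = mu * A_f * d
rise_time = slip / slip_velocity

print(f"M0        = {m0:.3e} N m")
print(f"radius    = {radius:.0f} m")
print(f"slip      = {slip:.3f} m")
print(f"rise time = {rise_time:.3f} s")
```

Changing `mw` or `stress_drop` shows how strongly fault size and slip depend on these two quantities, which is useful when setting up source parameters for other magnitudes.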

A smooth source time function (integrated Gauss function) mimicking a rise
time of approx. 0.25 s and the resulting analytical displacement seismograms
are shown in Fig. 2.29. The seismograms show a complicated waveform that
contains both P- and S-wave signals as well as near- and intermediate-field
terms that result in permanent displacements.

Fig. 2.28 Source time function. The source time function $M(t)$ (top) and its
time derivative $\dot{M}(t)$ with rise time $T_r$. This source time function is
descriptive of a stress drop occurring over a finite rise time. The rise time
determines the slip velocity across the fault plane. From Shearer (2009).


Table 2.3 Parameters for model earthquake.

Parameter     Value
$M_w$         5
$M_0$         $4 \times 10^{16}$ N m
$v_s$         3,000 m/s
$v_p$         5,196 m/s
$\rho$        2,500 kg/m$^3$
$t_{rise}$    0.25 s
source        [0, 0, 0] km
receiver      [4, 4, 4] km

Fig. 2.29 Analytical seismograms for a double-couple source in a homogeneous
medium. Top: Source time function $M_0(t)$ (solid line) and its time derivative
(dotted line). Bottom: Displacement for a receiver at 6.93 km distance from a
$M_w = 5$ double-couple point source. For parameters see Table 2.3.

Fig. 2.30 Seismic point sources. Elastic wave field for a point double-couple
source ($M_{xz}$) at the centre of the homogeneous model. Top left: Divergence.
Top right: Curl. Bottom left: $u_x$. Bottom right: $u_z$. Snapshots are
calculated with a 2D finite-difference scheme.


For any earthquake modelling it is useful to check simulation tools for such
basic problems, in particular when synthetic seismograms are being compared
with observations on absolute scales. We further illustrate the typical
simulation results for such a problem in Fig. 2.30. In this 2D calculation,
snapshots are shown for the $v_x$ and $v_z$ velocity wave fields emanating from
an $M_{xz}$ double-couple point source. Taking the curl and divergence of the
wave field separates P- and S-waves. Note the occurrence of nodal lines with
vanishing curl, divergence, or velocity components, and the polarity reversal
on both sides of these nodal lines. The velocity snapshots also illustrate the
dominant amplitude of the S-waves compared to P-waves. Despite the simplicity
of this simulation set-up, it is useful to spend some time understanding the
details of the results before moving to more complicated Earth and seismic
source models!

The elastic wave equation has some very important properties that can be
exploited for the simulation of seismic waves, and for the inversion for both
source and structure. This is the topic of the following sections.


2.6.3 Superposition principle, finite sources 


One of the most important properties of the elastic wave equation is its
linearity with respect to the source terms, whether described by volumetric
forces or a moment tensor. In fact, this property was already used to derive
the geometrical description of an earthquake source by a force double-couple.
This linearity is the basis for the inversion of the moment tensor and
finite-fault properties from observed ground motions. However, note that the
inverse problem for finite source behaviour can be strongly nonlinear depending
on observed data and source parametrization. With the prospects of solving the
wave equation for some discretized Earth model, how can finite sources (at
least in principle) be calculated?

A simple example of a realization of the superposition principle in connection 
with finite sources is given in Fig. 2.31. On a regular computational grid each 
grid point can be initialized with a time-dependent source. In this particular case, 
a subgrid of 8 x 8 points is assembled to form a sub-fault that breaks with the same 
temporal behaviour. Such a source initialization can be used to simulate waves for 
a large finite fault with complex kinematic rupture behaviour. 

Mathematically, the superposition can be elegantly described in the frequency
domain using the convolution theorem. Assuming a numerical solver that returns
the Green’s function $G_{ik}$ in terms of ground velocity components $v_i$ at
one receiver point $r$ for point source $k$, the complete velocity seismograms
can be assembled by

$$
v_i(\omega) = \sum_{k=1}^{N} \mathrm{slip}_k\,
  \exp\left[-i\omega\, t_k(c_{rup})\right]\, G_{ik}(\omega)\, S(R, \omega),
\tag{2.46}
$$

where the exponential term is merely a time shift $t_k$ that depends on rupture
speed $c_{rup}$, $\mathrm{slip}_k$ is the final slip of source $k$, and
$S(R, \omega)$ is the spectrum of the source time function with $R$ being the
rise time. The theory of finite sources is described in detail in Aki and
Richards (2002). The source description given above was used in a recent study
by Bernauer et al. (2014) to simulate finite source earthquakes. In this case
the numerical solver is used to calculate the (normalized) Green’s function
$G_{ik}$ for each source-receiver pair. Once this is done, the above equation
can be used to simulate arbitrary finite source scenarios.
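
The frequency-domain superposition of Eq. 2.46 can be sketched in Python as follows. Everything numerical here is an assumption made for illustration: the Green's functions are synthetic Gaussian pulses rather than solver output, the slip values and rupture delays are arbitrary, and the delays are chosen as whole samples so that the result can be compared exactly with a time-domain shift-and-sum.

```python
import numpy as np

nt, dt = 1024, 0.01
t = np.arange(nt) * dt
omega = 2.0 * np.pi * np.fft.rfftfreq(nt, dt)

# Hypothetical Green's functions: one band-limited pulse per point source
arrival = np.array([1.0, 1.2, 1.4, 1.6])                  # s
green = np.exp(-((t[None, :] - arrival[:, None]) / 0.05) ** 2)

slip = np.array([1.0, 0.5, 2.0, 1.5])                     # final slip per source
delay_steps = np.array([0, 10, 20, 30])                   # rupture delay in samples
t_rup = delay_steps * dt                                  # rupture time t_k

# Source time function with unit integral and its spectrum S(R, omega)
rise = 0.25
stf = np.exp(-((t - 2.0 * rise) / (rise / 4.0)) ** 2)
stf /= stf.sum() * dt
S = np.fft.rfft(stf) * dt

# Eq. 2.46: scale, time-shift and sum in the frequency domain
G = np.fft.rfft(green, axis=1)
shift = np.exp(-1j * omega[None, :] * t_rup[:, None])
V = np.sum(slip[:, None] * shift * G * S[None, :], axis=0)
v_freq = np.fft.irfft(V, n=nt)

# Time-domain reference: shift, scale, sum, then convolve with s(t)
summed = sum(s * np.roll(g, k) for s, g, k in zip(slip, green, delay_steps))
v_time = np.convolve(summed, stf)[:nt] * dt

print("max difference:", np.max(np.abs(v_freq - v_time)))
```

Once the spectra `G` are stored, a different rupture scenario only requires new values of `slip`, `t_rup` and `S`; no further wave simulation is needed.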

For simulation and inversion tasks, the superposition principle offers another 
interesting opportunity. When sources add to observed seismograms in a lin- 
ear way, why don’t we add up all seismograms (from earthquakes or man-made 
sources) and use the summed data as supershot or superdata for the inverse prob- 
lem for Earth’s structure? This process is called source stacking or source encoding 
and is indeed a hot topic. Particularly for marine exploration problems where 
sometimes tens of thousands of sources are used, this is an attractive concept. 
Without using source stacking every source would require its own simulation 
which in 3D can be very time consuming. Therefore it is obvious that, when 
many sources are involved, a substantial amount of computation time could be 
saved. A recent example which also highlights some of the problems that appear 
is discussed in Schiemenz and Igel (2013). 

In addition to the superposition principle there is another property of the
wave equation that might substantially reduce simulation efforts of seismic
observations.

Fig. 2.31 Finite source simulation by superposition. The superposition
principle implies that seismograms for arbitrary finite sources can be obtained
by adding up seismograms (numerical Green’s functions) from a sufficient number
of point sources appropriately timed and scaled to correctly reproduce the
kinematic source behaviour. Equal colours denote subfaults and corresponding
seismograms summed up to obtain the final seismogram. Figure from Wang (2007).

Fig. 2.32 Reciprocity. The symmetry of the elastic wave equation with respect
to time implies that for arbitrary velocity models a seismogram for a source at
A recorded at B is the same as a seismogram for the same source at B recorded
at A.


2.6.4 Reciprocity, time reversal 


The elastic wave equation is symmetric in time. In other words, wave
propagation is reversible (we ignore here irreversible effects due to
anelasticity). This has tremendous consequences for both forward and inverse
modelling. Using the concept of Green’s functions $G_{ij}$ corresponding to the
solution for a force in direction $j$ recorded in displacement component $i$,
this property can be expressed as

$$
G_{ij}(\mathbf{x}, t; \mathbf{x}_0, t_0) = G_{ji}(\mathbf{x}_0, -t_0; \mathbf{x}, -t),
\tag{2.47}
$$

where, on the left-hand side, $(\mathbf{x}_0, t_0)$ refers to the source and
$(\mathbf{x}, t)$ to the receiver. This property holds if the surface around
the volume is traction free. This concept is illustrated qualitatively with a
simple simulation example in Fig. 2.32.
A 1D space is initialized with a model with random velocity perturbations around 
a constant background model. We simulate acoustic wave propagation for a 
pressure source at point A recorded at receiver B and vice versa. The figure 
shows that both seismograms are identical. How can this be used in practice? 
An impressive example can be given when looking at a (now typical) marine 
exploration experiment. In many cases marine reservoirs are furnished with 
thousands of ocean-bottom geophones. To monitor temporal changes of the 
reservoirs during production, ships cruise several times a year over the area. They 
send signals hundreds of thousands of times, generating gigantic data volumes. 

When faced with the task of simulating such an experiment one can now be 
clever, and, rather than simulating each source, reverse the problem and simulate 
each receiver (and record at each source). This leads to a tremendous speeding 
up of the process! In addition, using the superposition principle one could stack 
all sources into one supershot (also called source encoding). Unfortunately, some 
problems arise when doing that and care has to be taken. An application of this 
approach to exploration data is presented in Schiemenz and Igel (2013). 
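
The 1D reciprocity experiment described for Fig. 2.32 can be sketched with a basic finite-difference scheme. This is an illustration under stated assumptions, not the code used for the figure: grid size, perturbation strength and wavelet are arbitrary, the function `simulate` is hypothetical, and the source is injected in the form $(1/c^2)\,\partial_t^2 p = \partial_x^2 p + s$ so that interchanging source and receiver is symmetric for constant density.

```python
import numpy as np

def simulate(c, isrc, irec, src, dx, dt):
    """Second-order finite differences for (1/c^2) p_tt = p_xx + s."""
    nx = c.size
    p_old = np.zeros(nx)
    p = np.zeros(nx)
    seis = np.zeros(src.size)
    for it in range(src.size):
        rhs = np.zeros(nx)
        # Second spatial derivative; end points stay at zero (fixed boundary)
        rhs[1:-1] = (p[2:] - 2.0 * p[1:-1] + p[:-2]) / dx ** 2
        rhs[isrc] += src[it] / dx            # point source as discrete delta
        p_new = 2.0 * p - p_old + (c * dt) ** 2 * rhs
        p_old, p = p, p_new
        seis[it] = p[irec]
    return seis

# Random velocity perturbations (up to 10 %) around a constant background
rng = np.random.default_rng(1)
nx, dx = 1000, 1.0
c = 3000.0 * (1.0 + rng.uniform(-0.1, 0.1, nx))
dt = 0.5 * dx / c.max()                      # satisfies the stability limit
nt = 1500
t = np.arange(nt) * dt

# Ricker wavelet as source time function
f0 = 100.0
arg = (np.pi * f0 * (t - 1.5 / f0)) ** 2
src = (1.0 - 2.0 * arg) * np.exp(-arg)

ia, ib = 300, 700                            # grid indices of points A and B
seis_ab = simulate(c, ia, ib, src, dx, dt)   # source at A, receiver at B
seis_ba = simulate(c, ib, ia, src, dx, dt)   # source at B, receiver at A

misfit = np.max(np.abs(seis_ab - seis_ba)) / np.max(np.abs(seis_ab))
print("relative misfit:", misfit)
```

For this particular discretization the two seismograms agree to round-off. With other source implementations or variable density, the agreement may hold only up to a scaling factor, which is worth checking before relying on reciprocity in a production code.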

Reciprocity also allows the concept of time reversal (which is just another way
of expressing reciprocity). Recorded seismograms from a point source can be
flipped in time and reinjected at the receiver locations. The back-propagating
wave field will focus in space-time at the original source coordinates. The
quality of this focusing will strongly depend on the receiver coverage and, for
real observations, on the knowledge of the Earth model. In practice this
principle can be used to image the spatio-temporal behaviour of large
earthquakes.

A simulation example is given in Fig. 2.33. A 2D finite-difference algorithm
is activated with a source at the centre of a square domain. The wave field is
recorded on a circular array surrounding the source point. It is recorded long
enough such that the entire direct wave field has passed and the receiver
points are at rest. In a subsequent simulation the stored seismograms are
flipped in time and reinjected at the receiver points as sources. The many
sources superimpose and part of the wavefield travels back to the source point
and constructively interferes at the correct time (when the reverse wave
propagation is carried out in the same model). Examples using 3D simulations
for time-reversal studies are given in Larmat et al. (2006) and Kremers et al.
(2011).

Structural imaging (reverse-time migration, full-waveform inversion using ad- 
joint techniques) and source imaging are all based on reciprocity. Some aspects 
of this are further discussed in the application sections at the end of this volume. 


2.7 Scattering 


If you are planning to use numerical methods to simulate elastic wave
propagation you definitely want to simulate waves through a more or less
complicated heterogeneous Earth model (otherwise you could use analytical or
quasi-analytical tools). How can we characterize heterogeneous Earth models?
How do the properties of the Earth model to be simulated affect our choice of
computational solutions? These questions lead us to the problem of scattering,
a topic (again) that could be extended to an entire volume, but we will only
scratch the surface (see recommendations at the end of this chapter).

Fig. 2.33 Time reversal. Top row: Acoustic wave field at increasing iterations
for a point source in the middle recorded in a circular array (crosses). The
wavefield is calculated with a 2D finite-difference scheme. Bottom row:
Seismograms recorded at the stations are flipped in time and re-injected as
sources at the corresponding receiver locations. The back-propagating wavefield
focuses at the original source location. This methodology is for example used
to destroy kidney stones.

Fig. 2.34 Structural heterogeneities. Illustration of heterogeneity classes
that impact the choice of specific numerical solvers for seismic
wave-propagation problems. The grey scaling indicates variations in seismic
velocities (or density, or elastic parameters). a: Layered media; b: layered
media, laterally heterogeneous; c: single scattering object; d: medium with
cavity (boundary conditions apply); e: smoothly varying heterogeneities;
f: heterogeneities on all scales.

11 At this point in my lecture I usually refer to concert halls. Sometimes
concert halls contain columns of a certain diameter. Never sit behind these
columns as the wavelength of audible frequencies might well be of that order.
In that sense the diameter would be the relevant correlation length.

Any energy propagating in an elastic medium is affected when it encounters a
change in medium properties. The most important notions in this context are the
so-called correlation length $a$ of the medium changes and the (let us say
dominant) wavelength $\lambda$ of the propagating wavefield. The correlation
length is a somewhat strange concept and sometimes it might not even exist, but
for the moment let us assume that it represents the dominant spatial wavelength
of the medium perturbation you want to investigate.¹¹ Mathematically, the
correlation length can be determined by looking at the central part of the
autocorrelation function of medium perturbations. In Gaussian and exponential
media the correlation length $a$ is related to the dominant wavelength of the
heterogeneities in the medium.
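
How a correlation length can be read off the autocorrelation function is sketched below for a synthetic 1D random medium. The set-up is an assumption made for illustration: the medium is white noise smoothed with a Gaussian kernel so that its autocorrelation is approximately $\exp(-x^2/a^2)$, and the correlation length is estimated as the first lag at which the normalized autocorrelation drops below $1/e$. Other definitions of $a$ are in use.

```python
import numpy as np

rng = np.random.default_rng(0)
n, dx = 2 ** 16, 1.0          # number of samples and grid spacing
a_true = 50.0                 # target correlation length (same units as dx)

# Gaussian smoothing kernel; its self-convolution is ~ exp(-x^2 / a^2)
x = (np.arange(n) - n // 2) * dx
kernel = np.exp(-2.0 * x ** 2 / a_true ** 2)

# Random medium perturbation: white noise filtered by the kernel (via FFT)
noise = rng.standard_normal(n)
medium = np.fft.irfft(np.fft.rfft(noise) *
                      np.fft.rfft(np.fft.ifftshift(kernel)), n=n)
medium -= medium.mean()

# Autocorrelation from the power spectrum, normalized to 1 at zero lag
power = np.abs(np.fft.rfft(medium)) ** 2
acf = np.fft.irfft(power, n=n)
acf /= acf[0]

# Correlation length: first lag where the autocorrelation falls below 1/e
lag = int(np.argmax(acf < np.exp(-1.0)))
a_est = lag * dx

print("target a =", a_true, " estimated a =", a_est)
```

The estimate fluctuates with the random realization; longer models (more correlation lengths in the domain) give a more stable value.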

Let us have a look at some classes of structural heterogeneities and discuss 
some of the implications for simulations. Examples are given in Fig. 2.34. A 
layered medium (see Fig. 2.34(a)) is an extremely important class of models. 
Sometimes it is sufficient to model observations (e.g. simple sedimentary struc- 
tures, geotechnical problems). For these model classes quasi-analytical schemes 
(such as the reflectivity method, or normal mode solutions for global wave prop- 
agation) exist and one would normally not choose numerical methods for their 
solution. However, because of these quasi-analytical reference solutions, lay- 
ered structures represent important benchmark problems against which purely 
numerical solutions can be tested. 

Once layered models become laterally heterogeneous, quasi-analytical methods
fail and one has to use numerical solutions. The graph in Fig. 2.34(b)
indicates that the material changes are abrupt. It is important to realize that
this implies that infinite spatial frequencies make up the model (a fundamental
property of $\delta$ or step functions). In general, internal boundary
conditions are not explicitly implemented; however, this layered model raises
the question whether a computational grid needs to follow the interfaces or not
(more on this later).

Sometimes one is interested in understanding the scattering behaviour of sin- 
gle objects, as in Fig. 2.34(c). Analytical solutions exist for simple structures; 
otherwise numerical solutions can be used. If the scattering object inside the 
medium is an empty cavity (Fig. 2.34(d), e.g. from an explosion, mining struc- 
tures) special stress-free boundary conditions might apply. Then computational 
meshes have to be specifically designed to follow the cavity shape. Smoothly 
varying velocity models as indicated in Fig. 2.34(e) are usually well suited for 
modelling with numerical tools. Models with all spatial wavelengths involved 
(Fig. 2.34(f)) might be problematic. For all models, the question of what seis- 
mic wavelengths will propagate through the medium is essential. In many cases, 
it might not be necessary to characterize the medium with high spatial wavenum- 
bers to obtain accurate solutions. In other words, a discontinuous (e.g. layered) 
medium could be replaced by a smooth version with (almost) no difference 
to the seismogram obtained for the original model. This is the problem of
homogenization, one of the hottest topics in computational wave propagation
today (see Chapter 10 on applications).

Finally, let me present a way to characterize the phase space of scattering
problems. This concept goes back to Aki and Richards (1980) (my favourite
figure in this epic book that did not make it into the second edition).
Fig. 2.35 allows us to understand a simulation (or scattering) problem in terms
of propagation distance (in terms of wavelengths) and correlation length
(scaled by wavelength). In common terms, scattering effects are strong when the
wavelength $\lambda$ and the size of the material perturbation, here
characterized by autocorrelation length $a$, are about the same (so around 1 on
the vertical axis).

When the scatterers are very small, the medium can be replaced by an equivalent
homogeneous medium (e.g. homogeneous anisotropic medium due to small cracks).
On the other hand, if the medium perturbations are of very long wavelength
($ka \gg 1$), they can be ignored and the medium can, at least locally, be
considered as a juxtaposition of homogeneous parts.

In an intermediate area ray theory can be applied. In this case
$\lambda \ll a$, the waveform is not altered by the medium perturbations, but
the ray path and arrival times are affected. In the scattering regime
$\lambda \approx a$, when waves propagate many wavelengths, numerical methods
are too expensive and wave propagation can be treated by special theories (e.g.
radiative transfer theory). Finally, the domain where numerical simulations
play the most important role is the scattering regime $\lambda \approx a$, when
the propagation distance is not too large. Anything beyond O(100) wavelengths
in 3D is still a challenge today.

It is instructive to review recent (and old) literature and place the
simulation tasks into the phase space of Fig. 2.35 (see exercises). Comparing
such points for simulation tasks 20 years ago and today, it is a little
frustrating that we have not moved very far (but, yes, it is a logarithmic
scale!).

Fig. 2.35 Phase space of scattering problems. Scattering problems characterized
by correlation length $a$ and propagation distance $L$ scaled by wavelength
$\lambda$ (after Aki and Richards, 1980). The computational method of choice
for a particular seismic wave simulation problem depends strongly on the
scattering property $a/\lambda$ and the propagation distance in terms of
wavelengths $L/\lambda$. Numerical methods are required for the strong
scattering regime ($a \approx \lambda$) but are restricted to some maximum
propagation distance due to limits of computational resources. See text for
details.


2.8 Seismic wave problems as linear
systems


In the sections above we have already made use of the concept of Green’s
functions. Green’s functions $G_{ij}(\mathbf{x}, t; \mathbf{x}_0, t_0)$ are the
impulse response for a source at $\mathbf{x}_0$ at time $t_0$ in direction $j$
recorded for ground motion component $i$ at point $\mathbf{x}$. Green’s
functions contain all information on a linear system, and the partial
differential equation describing elastic wave propagation (and the Earth for
that matter) can be treated as a linear system. While this property is used
extensively for the mathematical description of wave propagation and finding
analytical solutions, here, we would like to demonstrate the consequences for
numerical solutions and simulation studies. There is no such thing as an exact
(delta-like) impulse in nature. Therefore, we want to know Earth’s response for
a reasonable arbitrary source time function, which we denote as $s(t)$.

It is a well-known result that the response of a linear system to an arbitrary
input can be obtained by convolving the (arbitrary) input with the Green’s
function. Thus, in simple mathematical terms

$$
u_i(\mathbf{x}, t) = G_{ij}(\mathbf{x}, t; \mathbf{x}_0) \otimes s(t),
\tag{2.48}
$$

where $\otimes$ denotes convolution and the source time is $t_0 = 0$.
According to the convolution theorem, this corresponds to a multiplication in
the spectral domain; thus

$$
u_i(\mathbf{x}, t) = \mathcal{F}^{-1}\left[G_{ij}(\omega)\, S(\omega)\right],
\tag{2.49}
$$

where $\mathcal{F}^{-1}$ denotes the inverse Fourier transform and $S(\omega)$
is the (complex) source spectrum.

These relations also hold for the numerical solutions to the elastic wave
equations. However, we cannot expect that the Green’s function $G_{ij}$ will be
evaluated accurately for all frequencies (note that the spectrum of the impulse
$\delta$-function is white and contains all frequencies). Denoting the
numerical solutions by the ~ sign we obtain

$$
\tilde{u}_i(\mathbf{x}, t) = \tilde{G}_{ij}(\mathbf{x}, t; \mathbf{x}_0) \otimes s(t),
\tag{2.50}
$$

where $\tilde{G}_{ij}(\mathbf{x}, t; \mathbf{x}_0)$ is a Green’s function
calculated using a numerical solver. Again, this convolution corresponds to a
multiplication in the spectral domain; thus

$$
\tilde{u}_i(\mathbf{x}, t) = \mathcal{F}^{-1}\left[\tilde{G}_{ij}(\omega)\, S(\omega)\right].
\tag{2.51}
$$


We will illustrate this with a simple example using the 1D acoustic wave
equation and discuss the potential of this relation. In Fig. 2.36 a simulation
problem is set up in a homogeneous acoustic 1D medium. The analytical response
to an impulse injected at some source point is a step function shifted in time
by $\Delta/c$, where $\Delta$ is the distance between source and receiver and
$c$ is the acoustic velocity.

Initializing a numerical solver (here: a basic finite-difference solution to
the acoustic 1D wave equation as discussed later in the volume) will result in
the numerical Green’s function $\tilde{G}(t)$ shown in Fig. 2.36. This raw
result is useless as it contains numerical artefacts. However, the fact that
our numerical solver is also a linear system allows us to perform the
convolution after the simulation. This is demonstrated in the bottom figure.
Both convolving the raw numerical or analytical Green’s function with the
desired source time function $s(t)$, and directly injecting the source time
function $s(t)$ in the simulation lead to the same result, provided that the
frequency range of the source time function is chosen such that the final
result is accurate (more on this later).
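
The equivalence described above can be checked with a short Python sketch. It is a simplified stand-in for the experiment of Fig. 2.36, not the original code: the function `fd_acoustic_1d`, the grid parameters and the source wavelet are assumptions made for this illustration. A discrete impulse is injected to obtain the raw numerical Green's function, which is then convolved with $s(t)$ and compared with a simulation that uses $s(t)$ directly as source.

```python
import numpy as np

def fd_acoustic_1d(src, c=3000.0, dx=10.0, dt=1.0e-3,
                   nx=600, isrc=200, irec=400):
    """Basic finite-difference solver for p_tt = c^2 p_xx + s (homogeneous)."""
    p_old = np.zeros(nx)
    p = np.zeros(nx)
    seis = np.zeros(len(src))
    for it, s in enumerate(src):
        lap = np.zeros(nx)
        lap[1:-1] = (p[2:] - 2.0 * p[1:-1] + p[:-2]) / dx ** 2
        p_new = 2.0 * p - p_old + (c * dt) ** 2 * lap
        p_new[isrc] += dt ** 2 * s / dx      # inject source at one grid point
        p_old, p = p, p_new
        seis[it] = p[irec]
    return seis

nt, dt = 1000, 1.0e-3
t = np.arange(nt) * dt

# Band-limited source time function (first derivative of a Gaussian)
f0 = 15.0
t0 = 4.0 / f0
stf = -2.0 * (t - t0) * f0 ** 2 * np.exp(-(f0 * (t - t0)) ** 2)

# (1) Raw numerical Green's function: discrete delta as source
delta = np.zeros(nt)
delta[0] = 1.0 / dt
green_num = fd_acoustic_1d(delta, dt=dt)

# (2) Convolve afterwards with the desired source time function
seis_conv = np.convolve(green_num, stf)[:nt] * dt

# (3) Inject the source time function directly in the simulation
seis_direct = fd_acoustic_1d(stf, dt=dt)

diff = np.max(np.abs(seis_conv - seis_direct)) / np.max(np.abs(seis_direct))
print("relative difference:", diff)
```

Because the time-stepping scheme is linear and time-invariant, routes (2) and (3) agree to round-off for any `stf`. Whether either of them is an accurate solution of the wave equation still depends on the frequency content of `stf` relative to the grid spacing.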

While this might seem a technical aspect, it has tremendous practical con- 
sequences. Think of a situation in which you would like to investigate the 
frequency-dependent effects of waves through a random velocity model. The 
above properties imply that you can do this with a single simulation, altering 
the frequency content later, in the convolution step. Other potential applications 
are numerical Green’s function data bases for subfault systems. In principle, by 
appropriate superposition and convolution, arbitrary finite source scenarios can 
later be assembled without further simulations. Another powerful example of this 
concept is the recently published Instaseis project providing high-frequency global
seismograms almost instantaneously using pre-calculated numerical Green’s 
functions (van Driel et al., 20150). 


2.9 Some final thoughts 


The material presented in this chapter is intended to motivate you to (1) 
stick to some simple problems when you start using a numerical solver for 
wave-propagation problems, and (2) play around with some of the powerful 
functionalities (e.g. reciprocity, superposition, linearity, convolutions) of numer- 
ical simulations before working on more complicated problems. Trust me; this 
recommendation comes from experience with many students and researchers, 
who sometimes too quickly dive deep into the numerical simulation realm, and 
are surprised by the strange results they obtain. Here is a checklist of possi- 
ble points to consider when trying to understand whether a solver is returning 
correct answers (often these points can be answered by looking at wavefield 
snapshots): 


• Are P- and S- arrival times correct?

• Does your seismogram correctly reflect your input signal (for example in
  homogeneous media)? Careful: sometimes there is an integration, sometimes
  there is a derivative involved.

• Is there an analytical solution you can compare your results with
  (homogeneous medium, Lamb’s problem)?

• Is the radiation pattern consistent with your specific source model
  (explosion, double-couple)?

• Is the wavefront shape correct?

• Is the relative amplitude of waves (P, S, surface waves) according to
  expectations?

• Is the particle motion (i.e. polarization) compatible with the wave type?

• Are the polarities of the waves correct in the various directions?

Fig. 2.36 Green’s functions and simulations. Illustration with a numerical
approximation of the 1D acoustic wave propagation using the finite-difference
method. From top to bottom: a: (Arbitrary) source time function $s(t)$.
b: Analytical Green’s function $G(t)$. c: Numerical Green’s function
$\tilde{G}(t)$ obtained with a numerical solver. d: Comparison of analytical
solution (solid), numerical Green’s function convolved with $s(t)$ (dashed),
and numerical solution initialized with $s(t)$ as source time function (shifted
for illustration purposes). The results are identical.


Many of these aspects can be tested with the Python programs provided in the 
supplementary material. 


Chapter summary 


• Seismic wave propagation consists of oscillatory phenomena. The wavelengths
  involved are usually quite small compared to the total physical domain under
  consideration (planets, continents, reservoirs, volcanoes, rock samples).
  This implies challenging problems when discretizing the internal structures.

• The relation between wavelength $\lambda$ and (dominant or maximum) frequency
  $f$ of the seismic wavefield, $c = \lambda f$, where $c$ is phase velocity,
  is important to understanding the spatio-temporal scales for a seismic
  simulation problem.

• Seismic wave propagation is governed by a vectorial partial differential
  equation with the displacement (or velocity) field as the unknown.
  Analytical solutions only exist for simple (homogeneous) media.

• Body waves (P- and S-waves) with corresponding polarization and velocities
  are solutions to the elastic wave equation in homogeneous full space.
  Surface waves (Rayleigh and Love) are solutions to elastic half- (or layered
  half-) spaces, respectively.

• The most important physical boundary condition for seismic wave propagation
  is the stress-free surface boundary condition.

• Various rheologies (e.g. isotropic, anisotropic, viscoelastic, poroelastic)
  determine the stress-strain relationship.

• Sources of seismic waves can be described with force terms
  $\mathbf{f}(\mathbf{x}, t)$ or the seismic moment tensor
  $\mathbf{M}(\mathbf{x}, t)$.

• The superposition principle implies that any finite (or distributed) source
  can be considered an integral (or sum) over point sources.

• The elastic wave equation is symmetric in time. Sources and receivers can
  be interchanged. The seismograms are the same.

• The fact that the wave equation can be treated as a linear system has
  powerful implications for simulation problems.


FURTHER READING 


• Shearer (2009) provides a very comprehensive introduction to the concepts
of seismic wave propagation and earthquake sources at a basic level.

• Stein and Wysession (2003) is another introductory textbook to seismology
with a nice section on linear inverse problems.

• Aki and Richards (2002) is the theoretical seismologist’s bible. For any
in-depth discussion on the theoretical derivations of many results, refer to this
book.

• Ben-Zion (2003) provides some of the most important equations for
seismology with a focus on results relevant for earthquake physics.

• Kennett (2001) (and the subsequent volume) discusses seismic wave
propagation with a strong data point of view.

• Moczo et al. (2014) is a book on the finite-difference method but provides
a detailed description of the governing equations for a variety of rheologies.
They also cover the problem of absorbing boundary conditions.

• Carcione (2014) provides an in-depth discussion of wave propagation for
various rheologies and a discussion of several numerical methods (including
absorbing boundary conditions).

• Sato et al. (2012) is a compilation of articles in the field of seismic
scattering.

• Rienstra and Hirschberg (2016) develops the analytical solutions for the
acoustic wave equation (and others) in a comprehensive way.


EXERCISES 
Comprehension questions 
(2.1) Search for recent or current research projects using seismic methods (e.g.
exploration seismics, global seismology, volcanology, laboratory studies,
geotechnical problems). Collect information on frequency ranges and
size of sensor networks, and discuss consequences for seismic simulation
problems.

(2.2) Search for papers with seismic simulations, and extract information on
propagation length and scattering properties of the Earth models. Place
these parameters in the phase space of scattering problems shown in
Fig. 2.35.

(2.3) What geophysical parameters make up the Earth model when the
isotropic (anisotropic, viscoelastic) wave equation is used?

(2.4) Seismic velocities are functions of inverse density. Doesn’t that mean that
the denser the medium, the slower seismic velocities are? Explain!

(2.5) What is the difference between the velocity-stress and the
displacement-stress elastic wave equation? Is the solution the same?

(2.6) Describe the various rheologies for seismic wave propagation. How
would you rate them in terms of modelling real seismic observations?

(2.7) What is reciprocity? How can this principle be used in seismic wave
problems?

(2.8) What is time reversal in the context of the wave equation? Find
applications in seismology and medicine.

(2.9) Explain qualitatively the physical model for an earthquake point source.
What parameters would you expect in a file that initializes an
earthquake simulation? Are the point source properties uniquely defined given
seismic observations?

(2.10) Explain the concept of wave dispersion using Love and Rayleigh waves.
Describe their dispersive behaviour for various basic Earth models
(half-space, layered half-space).

(2.11) What boundary conditions are relevant for seismic simulation problems?
Give examples.

(2.12) Explain the vector wave field operators gradient, divergence, and curl
for seismic wave simulations (and observations). Why are seismic array
measurements relevant in this context?

(2.13) Explain the concept of linear systems, convolution, the convolution
theorem, and its relevance for wave simulations. What is the difference
between analytical and numerical Green’s functions?


Theoretical problems 


(2.14) Get information on the PREM model for global Earth structure (see
Fig. 2.2) and calculate the maximum and minimum seismic wavelengths
(P- and/or S-waves) for frequencies 1.0, 0.1, 0.01 Hz. Where do they
occur?

(2.15) Using the basic form of the 3D isotropic elastic wave equation, derive the
2D version by assuming invariance of all fields in y-direction. Show that
you obtain two independent (sets of) equations.

(2.16) Write out all components of the 3D isotropic elastic wave equation in
$u_x$, $u_y$, $u_z$ in the displacement formulation. Follow the strategy presented
in Eq. 2.3.

(2.17) Inject the trial solution $p(x,t) = p_0 e^{i(kx - \omega t)}$ into the source-free
1D acoustic wave equation $\partial_t^2 p = c^2 \partial_x^2 p$. Discuss the solution.

(2.18) Show that $p(x,t) = f(x - ct) + f(x + ct)$ is a general solution to the wave
equation $\partial_t^2 p = c^2 \partial_x^2 p$. Discuss the result.

(2.19) Assume two monochromatic plane waves propagating in x-direction: (a)
P-wave $u_x = A_x \sin(kx - \omega t)$ and (b) S-wave $u_y = A_y \sin(kx - \omega t)$. Calculate
in both cases the elements of stress and strain tensors. Assume that it is
possible to observe the z-component of the curl $\nabla \times \mathbf{u}$. The rotation
rate around a vertical axis is given as the time derivative of the
curl applied to the displacement field. How is the vertical component of
rotation rate related to the transverse acceleration $\partial_t^2 u_y$ (S-wave)? Would
the P-wave contribute to the curl? Discuss the potential of this result.

(2.20) Follow the approach described in the previous exercise. Find a way to
obtain phase velocity from collocated measurements of strain and
displacement (or velocity or acceleration) from body waves in an infinite
space. Which components do you have to combine?

(2.21) The 2003 Hokkaido earthquake (M8.1) led to a maximum horizontal
displacement of 1.5 cm for Love waves for an approximately 25-second
period recorded in Germany. Estimate the maximum dynamic strain
induced by the passing wavefield for a horizontal phase velocity of 5 km/s.

(2.22) Show that for attenuation the relation for the amplitude decay
$A(t) = A_0 e^{-\omega t / (2Q)}$ holds if $\delta = \ln(A_1/A_2)$ relates two subsequent
amplitudes, $Q = \pi / \delta$, and the wave propagates one cycle.

(2.23) What is the ratio between maximum S- and maximum P-wave
amplitudes in the far field of a homogeneous medium for a double-couple
point source? Use Eq. 2.41 and discuss implications for engineering
seismology.

(2.24) Estimate the difference of arrival times for Love and Rayleigh waves
propagating at various periods [T = 50 s, 200 s] to a distance of
10,000 km. Refer to Fig. 2.22.

(2.25) Show (e.g. graphically) that

$$\delta_a(x) = \frac{1}{\sqrt{2\pi a}}\, e^{-\frac{x^2}{2a}} \tag{2.52}$$

converges to a δ-function as $a \to 0$. Show that $\int \delta_a(x)\,dx = 1$ for any a.


Computational exercises 


(2.26) Write a computer program that uses vertical incidence reflection and
transmission coefficients (ignore multiples) to calculate Green’s
functions for a 1D model with a few layers. Apply the convolution model to
the Green’s function and calculate synthetic seismograms convolving the
Green’s function with a source time function (e.g. a Gaussian) according
to Eq. 2.48. Discuss the results.

(2.27) Use Eq. 2.41 to write a program for far-field Green’s functions in arbitrary
directions. Investigate the radiation pattern and polarization behaviour of
body waves.

(2.28) Plot the 3D radiation patterns A* for P and S far-field energy in Eq. 2.41.

(2.29) Write a program that plots the scalar moment $M_0$ as a function of energy
magnitude $M_w$ (Eq. 2.44).

(2.30) Stress drops $\Delta\sigma$ usually vary between 1 and 10 MPa. Use the relation
between stress drop, scalar moment, and (circular) rupture radius to plot
the expected radii for varying magnitudes for given stress drop. Carefully
check physical units!

(2.31) Write a computer program to check the reciprocity principle with the
far-field solutions of Eq. 2.41.

(2.32) Write a computer program that initializes a random 2D velocity
perturbation by spatially low-pass filtering random numbers using transform
methods.


Waves in a Discrete World 


In the last chapter we discussed the physics of wave propagation, seismic sources, 
and some of the phenomena to be expected when seeking solutions to seismic 
wave-propagation problems. Wave propagation—like most other phenomena in 
physics—is well described by partial differential equations defined in a continuous 
world. That is fine, as long as we find analytical solutions to the problems we pose, 
for example, how waves propagate in infinite homogeneous media. 

Obviously, if we want to get synthetic (or theoretical) seismograms for arbi- 
trary Earth models, this approach does not work. We need to find alternative 
strategies and solve the problem using a (possibly large) computer. This implies 
that we have to move to the discrete world. In other words, anything we describe 
(e.g. an Earth model with seismic velocities, a displacement field) will have a finite 
number of degrees of freedom.¹

The fact that we will operate in the discrete world raises a lot of questions. 
How will we describe space- or space—time-dependent fields in a fully discrete 
way? What is the impact of the dimensionality (1D, 2D, or 3D) of our problem on
its numerical solution? What strategies exist for the initialization of computational
meshes? How do we deal with problems in various geometries (e.g. Cartesian,
cylindrical, spherical, arbitrary)? And finally, how are large-scale problems solved
on modern (parallel) computers?

Many of these questions are relevant for the understanding of the specific 
numerical method applied to the seismic wave-propagation problem. Therefore, 
we briefly illustrate the concepts behind these issues and indicate their connection 
with the methods discussed in the following chapters. 


3.1 Classification of partial differential 
equations 


The general properties of specific partial differential equations are extremely 
important for finding efficient numerical solutions. Therefore, let us briefly 
investigate the properties of our elastic wave equation focusing on the 1D 
acoustic example. Assuming two independent variables, space x and time t, and a
space-dependent coefficient c(x) (acoustic velocity), the source-free partial differential
equation (pde) for the acoustic wavefield p(x, t) reads

$$p_{tt} - c(x)^2\, p_{xx} = 0, \tag{3.1}$$


where (only in this section) the lower indices imply differentiation. 
1 Examples: If you parametrize a homogeneous half-space with P- and
S-velocities, then that number is 2. Parametrizing a reservoir with
1,000 × 1,000 × 1,000 grid points implies 2 × 10⁹ d.o.f. (nothing unusual
these days!).
Fig. 3.1 Hyperbolic conic sections. Partial differential equations are classified
by analogy with conic sections. The elastic wave equation is of hyperbolic
form. A hyperbola (white lines) is obtained when a vertical plane cuts a cone.


First, we note that this pde is linear, which is the case when all terms involving p 
(and its derivatives) can be expressed as a linear combination. Another require- 
ment for linearity is that the coefficients (here: c(x) and unity) are independent 
of p. The coefficients may depend on the independent variables (here: x). In our 
problem this will certainly be the case: The coefficients of the wave equation de- 
scribe Earth’s elastic properties and will vary as a function of space. The fact 
that the wave equation is linear is tremendously important as it allows us—at least 
under some circumstances—to come up with analytical solutions. We have seen 
some of those already in the previous chapter and they are very important to ver- 
ify the accuracy of the numerical solutions. This is in general not the case for 
nonlinear pdes. 

Second, pdes are classified with respect to the highest-order derivative that ap- 
pears in the equations. In our case this is second order. It turns out that the wave 
equation can also be written as a first-order system of equations that is formally 
equivalent (this will be shown later in connection with the finite-volume method) 
to the advection (or transport) equation, here shown in the homogeneous scalar 
1D case 


$$p_t + c\, p_x = 0, \tag{3.2}$$


where space-time dependence is implicit. Furthermore, partial differential
equations are classified by analogy with conic sections. Writing the general form of a
linear, second-order equation in x, t as

$$A p_{xx} + B p_{xt} + C p_{tt} + D p_x + E p_t + F p = 0, \tag{3.3}$$

where the capital letters are the coefficients, the partial differential equation can be
transformed (e.g. by a Fourier transform) into an analogous form corresponding
to conic sections such that

$$A x^2 + B x t + C t^2 + D x + E t + F = 0, \tag{3.4}$$

where the independent variables take on another meaning (e.g. wavenumber and
frequency). To classify the pde one needs to calculate the discriminant $B^2 - 4AC$
to find the category, where

$$
\begin{aligned}
B^2 - 4AC = 0 &\;\Rightarrow\; \text{parabolic} \\
B^2 - 4AC < 0 &\;\Rightarrow\; \text{elliptic} \\
B^2 - 4AC > 0 &\;\Rightarrow\; \text{hyperbolic}.
\end{aligned}
\tag{3.5}
$$


With these classes it is easy to see that Eq. 3.1 is hyperbolic for all possible 
coefficients c(x). This is also true for the complete vectorial wave equation. The 
wave equation is the classic hyperbolic pde. Once initial conditions for the field and 
its time derivative (for example, $p(x, t = 0)$ and $\partial_t p(x, t = 0)$) are specified, the
solution at all times is fixed (in the absence of any further input). The solutions
to hyperbolic equations are wavelike and disturbances travel with finite prop- 
agation speeds. This distinguishes them from elliptic and parabolic problems, 
where perturbations of initial conditions or boundaries have an immediate effect 
everywhere. 
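
The classification rule of Eq. 3.5 can be checked with a few lines of Python. The following is a minimal sketch; the function name `classify_pde` and the example coefficients are illustrative assumptions and not part of the supplementary material. For the acoustic wave equation (Eq. 3.1) the coefficients of the general form (Eq. 3.3) are A = -c², B = 0, C = 1, so the discriminant is 4c² > 0 for any non-zero velocity.

```python
def classify_pde(A, B, C):
    """Classify A p_xx + B p_xt + C p_tt + ... = 0 via the discriminant B^2 - 4AC."""
    disc = B * B - 4.0 * A * C
    if disc > 0.0:
        return "hyperbolic"
    if disc < 0.0:
        return "elliptic"
    return "parabolic"

c = 3000.0  # hypothetical acoustic velocity in m/s

# Wave equation p_tt - c^2 p_xx = 0: A = -c^2, B = 0, C = 1
wave = classify_pde(-c**2, 0.0, 1.0)
# Diffusion-type equation p_t - p_xx = 0: A = -1, B = 0, C = 0 (no p_tt term)
diffusion = classify_pde(-1.0, 0.0, 0.0)
# Laplace-type equation p_xx + p_tt = 0: A = 1, B = 0, C = 1
laplace = classify_pde(1.0, 0.0, 1.0)

print(wave, diffusion, laplace)
```

Because the discriminant of the wave equation depends only on c², the result is the same for every non-zero value of c(x), in line with the statement above.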

What about the first-order advection Eq. 3.2? In fact, the second-order wave 
equation can be obtained from the first-order advection equation by a few simple 
steps (see chapter on the finite-volume method). As hyperbolic problems appear 
everywhere in physics, numerical schemes for their solution can be transferred 
from other areas of physics (or from engineering) to seismology. Let us have 
a look at some fundamental concepts concerning numerical solutions to partial 
differential equations. 


3.2 Strategies for computational wave 
propagation 


Our problem is finding numerical solutions to wave equations as indicated by the 
two examples given above. It is useful to consider the space- and time-derivatives 
of these equations separately to understand and categorize the methods we will 
encounter. Luckily, as our wave equations are linear, we can write them as 


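$$
\begin{aligned}
\partial_t^2 p(x,t) &= L(p,t) \quad\rightarrow\quad L(p,t) = c(x)^2\, \partial_x^2 p(x,t) \\
&\text{or} \\
\partial_t p(x,t) &= L(p,t) \quad\rightarrow\quad L(p,t) = -c(x)\, \partial_x p(x,t),
\end{aligned}
\tag{3.6}
$$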


where L(.) denotes a linear operator, and space-time dependencies are added for 
clarity. Note that the right-hand sides contain only spatial derivatives, or space- 
dependent functions like c(x). Assuming that the space-dependent fields p and c 
can be discretized appropriately in space for a given time, this formulation can be 
termed a semi-discrete scheme. 

An important statement at this point is that all the numerical methods
presented in this volume, despite their sometimes fundamentally different
underlying mathematical concepts, differ primarily in the way the right-hand side
of the above equations is treated! With the appropriate initial condition for a
wave-propagation problem (everything is at rest), the left-hand side becomes
an extrapolation problem in time that will always be solved by a
finite-difference-type approximation. In that sense the right-hand side can be
considered a general interpolation problem in space.
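
The separation between time extrapolation and spatial operator can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the implementation used later in the volume: the names `rhs_fd` and `propagate`, the 3-point stencil for the right-hand side, the grid parameters, and the Gaussian initial condition with zero initial velocity (used here instead of a source term) are all assumptions made for the example. The point is that `propagate` never needs to know how the right-hand side is evaluated; another method would only replace `rhs`.

```python
import numpy as np

def rhs_fd(p, c, dx):
    """Right-hand side L(p) = c(x)^2 d^2p/dx^2, here with a 3-point finite-difference stencil."""
    d2p = np.zeros_like(p)
    d2p[1:-1] = (p[2:] - 2.0 * p[1:-1] + p[:-2]) / dx**2
    return c**2 * d2p

def propagate(p0, c, dx, dt, nt, rhs=rhs_fd):
    """Extrapolate p_tt = L(p) in time with a centred finite difference."""
    p = p0.copy()
    # Zero initial velocity: p(-dt) = p(0) + dt^2/2 * p_tt(0), accurate to second order
    p_old = p0 + 0.5 * dt**2 * rhs(p0, c, dx)
    for _ in range(nt):
        p_new = 2.0 * p - p_old + dt**2 * rhs(p, c, dx)
        p_old, p = p, p_new
    return p

# Hypothetical setup: homogeneous medium, Gaussian initial pressure at x = 200
nx, dx, dt = 401, 1.0, 0.5
x = np.arange(nx) * dx
c = np.ones(nx)
p0 = np.exp(-((x - 200.0) / 10.0) ** 2)

p = propagate(p0, c, dx, dt, nt=200)        # total time t = 100
i_right = 200 + int(np.argmax(p[200:]))     # position of the right-going pulse
print(i_right, p.max())
```

With c = 1 and t = 100 the initial pulse splits into two pulses of about half the amplitude travelling to x ≈ 100 and x ≈ 300, as expected from the analytical solution f(x - ct) + f(x + ct).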

Because one of the main distinctions between the various methods is the way 
the space-dependent fields are described, we will have a closer look before diving 
into the details. Durran (1999) distinguishes two basic strategies for spatial dis- 
cretization: the grid-point method and the series-expansion method. We illustrate 
these strategies in Fig. 3.2. 
Fig. 3.2 Basic spatial discretization schemes. Top left: Grid point
approximation of a function f(x) (e.g. finite-difference method). Top
right: Function approximation by sum over trigonometric functions
(indicated by dotted lines, normalized and shifted). The original discrete
function f(x) (dashed line) is replaced by an approximation (solid line)
that exactly interpolates f(x) at the grid points (dots), e.g. pseudospectral
methods. Bottom left: Space is divided into elements. Inside these elements
the function f(x) (solid line) is approximated by a linear function (dashed
line), continuous across element boundaries (e.g. finite-element method).
Bottom right: Space is divided into finite volumes (cells) and the function
f(x) is approximated by the average value (located at the cell centre
indicated by a dot), e.g. finite-volume method. Note the occurrence of
discontinuities across cell (volume) boundaries.
The grid-point method approximates an arbitrary function f(x) at a discrete
set of points (Fig. 3.2, top left) and only there. This is the principle used in the 
finite-difference method. Values are usually not required in between grid points, 
but if they were, they would have to be obtained by interpolation. 

An entirely different strategy is to approximate a function by a sum over some 
basis functions (e.g. a Fourier series expansion) on the entire domain (Fig. 3.2, 
top right). For certain grid-series combinations (e.g. regularly spaced grid points 
and Fourier series) it turns out that an arbitrary function can be exactly interpo- 
lated (together with its derivatives) at the grid points. That is a cool property 
also used inside elements in the spectral-element or the nodal discontinuous 
Galerkin method. When the calculation of derivatives is carried out using Fourier 
transforms we speak of the Fourier pseudospectral method. 
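
The exactness of the Fourier derivative on a regular grid can be illustrated with a short NumPy sketch. The test function, the number of grid points, and the periodic domain of length 2π are assumptions chosen for the example; the function must be periodic and band-limited (no content at or above the Nyquist wavenumber) for the result to be exact to machine precision.

```python
import numpy as np

nx = 64
length = 2.0 * np.pi
x = np.arange(nx) * length / nx                  # regular, periodic grid

# Band-limited test function and its analytical derivative
f = np.sin(3.0 * x) + 0.5 * np.cos(5.0 * x)
df_exact = 3.0 * np.cos(3.0 * x) - 2.5 * np.sin(5.0 * x)

# Angular wavenumbers k corresponding to the discrete Fourier coefficients
k = 2.0 * np.pi * np.fft.fftfreq(nx, d=length / nx)

# Derivative in the wavenumber domain: multiply by ik, then transform back
df_num = np.real(np.fft.ifft(1j * k * np.fft.fft(f)))

max_err = np.max(np.abs(df_num - df_exact))
print(max_err)
```

The error is at the level of floating-point round-off, in contrast to a low-order finite-difference stencil, whose error would depend on the grid spacing.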

In finite-element-type methods (Fig. 3.2, bottom left) space is divided into 
elements (possibly of varying size) and in the most basic approach the solution 
fields are approximated by linear functions inside the elements that are continu- 
ous across the element boundaries. Higher-order polynomial approximations (e.g. 
quadratic functions or Lagrange polynomials) are also possible, leading to more 
accurate representations. 
Another possible discretization is based on defining finite volumes or cells in
which the average value of a function is defined (Fig. 3.2, bottom right). In- 
side the cells higher-order approximations (linear, polynomial) are also possible. 
However, a fundamental difference is the fact that the approximate fields inside 
the cells are discontinuous across the cell boundaries. This requires the definition 
of fluxes as information needs to be exchanged between cells. This discretization 
scheme is used both in the finite-volume method as well as in the discontinuous 
Galerkin method. 
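
The difference between the continuous element-wise representation and the discontinuous cell-average representation can be made concrete with a small sketch. The function sin(x), the five equal elements on [0, π], and the helper names `fe_linear` and `fv_constant` are assumptions for illustration only; the sketch approximates a given function and does not solve a wave equation.

```python
import numpy as np

edges = np.linspace(0.0, np.pi, 6)     # boundaries of 5 elements (cells)

def fe_linear(xq):
    """Finite-element-like: linear inside each element, continuous at the nodes."""
    return np.interp(xq, edges, np.sin(edges))

# Finite-volume-like: exact cell averages of sin(x), (cos a - cos b) / (b - a)
cell_avg = (np.cos(edges[:-1]) - np.cos(edges[1:])) / np.diff(edges)

def fv_constant(xq):
    """Piecewise-constant representation by the cell average."""
    idx = np.clip(np.searchsorted(edges, xq, side="right") - 1, 0, len(cell_avg) - 1)
    return cell_avg[idx]

# Compare both representations just left and right of an internal boundary
eps = 1e-9
xb = edges[2]
jump_fe = abs(fe_linear(xb + eps) - fe_linear(xb - eps))
jump_fv = abs(fv_constant(xb + eps) - fv_constant(xb - eps))
integral_fv = np.sum(cell_avg * np.diff(edges))   # cell averages conserve the integral
print(jump_fe, jump_fv, integral_fv)
```

The linear representation has no jump at the element boundary, whereas the cell averages differ on either side. It is this jump that makes the definition of fluxes between cells necessary.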

In the following sections we will investigate what discrete computational 
models can look like in various dimensions. 


3.3 Physical domains and computational
meshes 


We perceive our world as a 3D space and describe its properties with functions 
of coordinates x, y, z or r, θ, φ. Therefore, it seems natural to discretize a physical
domain for a wave-propagation problem in 3D. Obviously, this is computationally 
the most expensive option. Sometimes a problem can be reduced to 2D (rarely 
to 1D) with tremendous benefits for computational costs. While the reduction 
to lower-dimensionality seems trivial there are a few things to be aware of, in 
particular if one wants to compare synthetic seismograms with observations. 


3.3.1 Dimensionality: 1D, 2D, 2.5D, 3D 


The bulk of this volume is dedicated to the presentation of various numerical so- 
lutions to the wave equation in 1D (with an exception in the chapter on finite 
differences, with 2D examples). The reason is that once the concepts are understood
in 1D they can be extended to higher dimensions in a straightforward way.²

The field of computational seismology started some decades ago with 2D sim- 
ulations, simply because at that time computational resources did not allow 3D 
calculations. Let us illustrate some of the consequences by looking at the graphs 
in Fig. 3.3. Nothing much needs to be said about 1D calculations. They have 
merely tutorial value. However, 2D Earth models can be of substantial use to 
understanding wave phenomena. 

In 2D all space-dependent fields become invariant with respect to one dimen- 
sion (here: y). In other words, all space derivatives with respect to this variable 
vanish, leaving two independent variables (here: x and z). Because numerical 
methods are employed to study heterogeneous structures this implies that all het- 
erogeneities defined in the x — z plane also are invariant in y. For example, a 
circular velocity perturbation turns into a cylinder, a point scatterer becomes a 
line scatterer, etc. 

However, the most important consequence is the fact that the spatial source
function also becomes invariant in y. In 2D a point source defined at some
coordinate [x_src, z_src] actually becomes a line source. This implies that for any
source time function there are always contributions from the line source to the
seismogram recorded at any point in the x, z plane.³ This can be analysed with the
2D numerical schemes available in the supplementary electronic material. Having
said that, it is obvious that 2D numerical calculations cannot be compared
directly with observations (at least without carefully considering what aspects can
be compared; but this goes beyond the scope of this text).


Fig. 3.3 Dimensionality. The dimensionality of the computational domain
is important for the choice of discretization and meshing. In 2D it is important
to recognize the invariance of all fields in the third (here y) direction including
the source (line source). Fractional dimensions like 2.5D indicate the fact that
while the computational domain is 2D the behaviour of the wavefield (e.g. in
terms of geometrical spreading) is 3D.


2 That’s what they always say. Indeed going to higher dimensions may imply far
more complicated book-keeping.


Fig. 3.4 Honouring (internal) structure. One of the key questions in
computational mesh generation is whether it is necessary to honour the geometry
of (internal or surface) structures (bottom), or whether it is sufficient to have
inaccurate (e.g. blocky) representations of interfaces that do not honour their
structure (top). (Figure courtesy of Martin Kaser.)


3 Compared to an impulsive response in a 3D medium, the response of a 2D point
(i.e. line) source involves convolution with 1/√t, thus a vanishing tail. For SH-wave
(or acoustic) propagation in a 2D medium waveforms can be converted to 3D
seismograms by deconvolution with 1/√t. In the general case this is not possible.
This problem was recently investigated by Forbriger et al. (2014).

Because for a long time it was impossible to carry out 3D calculations for any 
realistic Earth structures, there were attempts to find ways around these restric- 
tions. This led to the development of so-called 2.5D schemes (a strange-sounding 
concept!) in which at least the problem of the incorrect geometrical spreading due 
to the line source could be fixed. This can be achieved with analytical tools or by 
moving to other coordinate systems. 

An example is given in Fig. 3.3. Assuming invariance along the lines of
constant latitude φ, the problem of global wave propagation expressed in
spherical coordinates can be reduced to a 2D computational domain in r, θ.
When the source is centred at the axis θ = 0 (which is a singularity in the
wave equation, but there are ways around that), the correct 3D geometrical
spreading is obtained. A powerful recent example of this approach is
Instaseis, a high-frequency solver for global wave fields based on the
spectral-element method (van Driel et al., 2015b).

A further issue arising for problems beyond 1D is the specific coordinate system used
to describe the elastic wave equation. There are fundamental differences when
writing the wave equation in Cartesian, cylindrical, or spherical coordinates (see
exercises). The standard procedure to obtain computational meshes (e.g. regularly
spaced grid points along the axes) obviously leads to very different meshes. This
will be further discussed in the next section, as well as in Chapter 10 (on
applications).


3.3.2 Computational meshes 


The generation of computational meshes (or grids; both terms are used synony- 
mously) is a field in itself and a subdomain of computational geometry. Even 
though in the remainder of the volume we restrict the description of numerical 
methods mostly to 1D solutions, the very basic concepts of meshes and mesh- 
ing shall be briefly introduced. In 2D and 3D the choice of the specific mesh 
on which geophysical parameters are initialized is tightly linked to the numerical 
method used to approximate the wave equation. 

Let us take a simple example to illustrate a fundamental problem when 
choosing a computational mesh. In many cases seismic velocity models are char- 
acterized by surfaces at which there are jumps (discontinuities) in geophysical 
properties (e.g. seismic velocities or density). Other curved features include the 
topography at the Earth’s surface or internal faults with arbitrary geometrical 
shapes. Fig. 3.4 illustrates the problem in 2D when the target is to model a curved 
internal boundary. 
When a simple regular grid is used, the boundary cannot be exactly
reproduced (honoured). It is represented or replaced by a blocky structure. On the other
hand, a spatial discretization scheme based on triangles has no problem in
following (honouring) the internal structure. This simple illustration intuitively suggests
that the latter kind of mesh certainly seems better. It turns out that from a
computational (and mathematical) point of view the two strategies are actually very
different. It is fair to say that problems posed on unstructured meshes are more
difficult to solve. This will become clear when the methods that allow accurate solutions
for such meshes (e.g. finite-element-type methods) are being discussed.

The question of the honouring vs. not-honouring strategy is currently a 
hot topic of research, in particular for seismic wave propagation. This relates 
to the potential of replacing an Earth model that has discontinuous struc- 
tures (like the one in Fig. 3.4) with a low-pass version that is smooth but 
leads (within some small error margin) to the same seismograms. This pro- 
cess is called homogenization and will be further discussed in the chapter on 
applications. 

We proceed with a basic classification of meshes. 


3.3.3 Structured (regular) grids 


Structured or regular grids are subdivisions of (1D, 2D, or) 3D space characterized
by a regular connectivity. From a computational point of view this implies that
the mesh can be defined uniquely by vectors (1D) or matrices (2D or 3D).
Examples are given in Fig. 3.5. In its most basic form a 3D regular mesh
consists of brick-like parallelepipeds, the corners of which can be addressed with
indices i, j, k and physical coordinates (i dx, j dy, k dz), where dx, dy, dz are the
side lengths of the hexahedral structures. Historically, the numerical solutions for
seismic wave-propagation problems (e.g. using finite-difference methods) started
with simulations on such regular meshes.
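
The index-based addressing of a regular grid can be sketched with NumPy arrays. Grid dimensions, spacings, the two-layer velocity model, and the helper `neighbours` are hypothetical values and names chosen for illustration. The key point is that neither coordinates nor neighbour relations need to be stored explicitly: both follow from the indices i, j, k.

```python
import numpy as np

nx, ny, nz = 4, 3, 5              # number of grid points per direction
dx, dy, dz = 10.0, 10.0, 5.0      # grid spacings (e.g. in m)

# Physical coordinates follow directly from the indices: (i dx, j dy, k dz)
i, j, k = np.meshgrid(np.arange(nx), np.arange(ny), np.arange(nz), indexing="ij")
x, y, z = i * dx, j * dy, k * dz

# An Earth model is simply a 3D array with the same shape as the grid
vp = np.full((nx, ny, nz), 3000.0)
vp[:, :, 3:] = 5000.0             # faster layer below the fourth depth level

def neighbours(i, j, k):
    """Implicit connectivity: neighbours are obtained by incrementing indices."""
    cand = [(i - 1, j, k), (i + 1, j, k), (i, j - 1, k),
            (i, j + 1, k), (i, j, k - 1), (i, j, k + 1)]
    return [(a, b, c) for a, b, c in cand
            if 0 <= a < nx and 0 <= b < ny and 0 <= c < nz]

print(x[2, 1, 3], y[2, 1, 3], z[2, 1, 3], vp[2, 1, 3])
print(len(neighbours(0, 0, 0)), len(neighbours(1, 1, 1)))
```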

The use of regular meshes works fine, as long as (1) the geophysical model is
sufficiently smooth and no complex geometries have to be obeyed, and (2) seismic
velocities do not vary too much.⁴ Regular grids are also possible when
discretizing the wave equation in other coordinate systems (e.g. spherical coordinates,
Fig. 3.5(b)). However, the specific characteristics of regular meshes in spherical
(or cylindrical) coordinates lead to problems due to the spatially varying grid
cells (see section on applications in global seismology).

The problem with honouring (at least smooth) internal or external surfaces 
can be fixed if an analytical representation of these surfaces can be obtained. 
Then, grids can be stretched by a curvilinear coordinate transform (Fig. 3.5(c)).
It is quite obvious that for general surfaces (like Earth’s topography) this is not 
possible. The advantage (compared to the use of unstructured meshes) is the fact 
that one can stick to vector and matrix descriptions of internal structures. 

In seismology (to some extent also in exploration seismics) one can go a long
way with relatively smooth Earth models and structures favouring structured
mesh approaches. Only in the past decade have strong efforts been undertaken
to find efficient numerical methods that work for Earth models defined on
unstructured meshes.


Fig. 3.5 Structured (regular) grids in 2D and 3D. a: Regular, equi-spaced 2D
grid in Cartesian geometry; b: multi-domain regular 2D grid in spherical
coordinates; c: regular, stretched grid that follows smooth surface; d: regular 3D
Cartesian grid with blocky topography surface; e: regular 3D grid in spherical
coordinates (section) with grid points based on Chebyshev collocation points;
f: regular surface grid of sphere meshed with the cubed-sphere approach. After
Igel et al. (2015).


4 It is hard to give a precise threshold here, but when the velocities vary by
an order of magnitude, regular grids are certainly sub-optimal. This situation
occurs for example in strong-ground motion problems where surface low-velocity
layers have to be honoured, and in global seismology.


Fig. 3.6 Connectivity. Space is discretized with nine elements and ten
vertices. The connectivity consists of matrices containing the list of vertices,
elements, and neighbours (and sides in 2D and 3D).


Fig. 3.7 Unstructured grids in 2D and 3D. a: Unstructured grid based on
Delaunay triangulation with cross-cutting interface; b: Voronoi cells for
unstructured grid; c: tetrahedral mesh for the Matterhorn (mountain in
Switzerland); d: tetrahedral mesh for spherical Earth model. After Igel et al. (2015).


3.3.4 Unstructured (irregular) grids 


The fundamental difference of unstructured grids is the fact that they can no 
longer be expressed as a vector (or matrix) on the computer without further 
specifications. Unstructured grids require the definition of the so-called con- 
nectivity. What does this connectivity look like? Even though trivial, the 1D 
example of connectivity matrices shown in Fig. 3.6 illustrates the added
complexity of unstructured grids and the substantially larger requirements of computer
storage.
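
A minimal sketch of such a connectivity in 1D, with nine elements and ten vertices as in Fig. 3.6, could look as follows. The vertex positions and the array names `vertices`, `elements`, and `neighbours` are assumptions made for illustration; the layout of the matrices in the figure may differ.

```python
import numpy as np

# Ten irregularly spaced vertices (hypothetical positions)
vertices = np.array([0.0, 0.08, 0.2, 0.25, 0.4, 0.55, 0.6, 0.75, 0.9, 1.0])
ne = len(vertices) - 1                          # nine elements

# Element list: indices of the left and right vertex of each element
elements = np.column_stack([np.arange(ne), np.arange(1, ne + 1)])

# Neighbour list: indices of the left and right neighbouring element (-1 = none)
neighbours = np.column_stack([np.arange(ne) - 1, np.arange(ne) + 1])
neighbours[-1, 1] = -1

# Element sizes have to be looked up through the connectivity
size = vertices[elements[:, 1]] - vertices[elements[:, 0]]

print(elements.shape, neighbours[0], neighbours[-1], size.min(), size.max())
```

Compared with a regular grid, which is fully described by the number of points and the grid spacing, three arrays are needed here. In 2D and 3D the element and neighbour lists grow accordingly, and lists of sides are added.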

Examples are shown in Fig. 3.7. Consider a set of points defined in the 2D
plane (e.g. information on geophysical parameters is known at these locations).
You want to divide space up into elements (cells) making use of these points.
There are many (non-unique) ways of connecting the points with triangles. One
particular choice is the so-called Delaunay triangulation, which breaks up
a point set into triangles such that no other point lies inside the circle passing
through each triangle’s three vertices (its circumcircle). Triangular (or tetrahedral)
discretization is the most common meshing strategy for highly complex structures.
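
The empty-circle criterion can be tested directly. The sketch below does not construct a triangulation (mesh generators do that); it only checks whether a given triangulation satisfies the Delaunay condition. The four points, the two candidate triangulations, and the function names `circumcircle` and `is_delaunay` are assumptions for illustration.

```python
import numpy as np

def circumcircle(a, b, c):
    """Centre and radius of the circle through the 2D points a, b, c."""
    ax, ay = a
    bx, by = b
    cx, cy = c
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    ux = ((ax**2 + ay**2) * (by - cy) + (bx**2 + by**2) * (cy - ay)
          + (cx**2 + cy**2) * (ay - by)) / d
    uy = ((ax**2 + ay**2) * (cx - bx) + (bx**2 + by**2) * (ax - cx)
          + (cx**2 + cy**2) * (bx - ax)) / d
    centre = np.array([ux, uy])
    return centre, np.linalg.norm(centre - np.asarray(a))

def is_delaunay(points, triangles):
    """True if no point lies strictly inside the circumcircle of any triangle."""
    for tri in triangles:
        centre, radius = circumcircle(*points[list(tri)])
        for n, p in enumerate(points):
            if n not in tri and np.linalg.norm(p - centre) < radius - 1e-12:
                return False
    return True

# Four points forming a convex quadrilateral: two ways to split it into triangles
points = np.array([[0.0, 0.0], [4.0, 0.0], [2.0, 1.0], [2.0, -1.0]])
split_short = [(0, 2, 3), (1, 2, 3)]   # split along the short diagonal
split_long = [(0, 1, 2), (0, 1, 3)]    # split along the long diagonal

print(is_delaunay(points, split_short), is_delaunay(points, split_long))
```

Both triangulations connect the same points, but only the split along the short diagonal fulfils the criterion. The other one produces thin triangles whose circumcircles contain the fourth point.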

Another strategy to subdivide space into elements or cells starting from
arbitrary points is the concept of Voronoi cells (Fig. 3.7(b)). Take any two
neighbouring points and determine the line that is equidistant from them (the
perpendicular bisector, which is orthogonal to the line connecting the points
and passes through its midpoint). The cell around a point is then obtained by
connecting the intersections of the bisectors with respect to the surrounding
points. Within a cell so defined, every location is closer to the point that
generated the cell than to any other point. These structures play an important
role in many branches of computational geometry and physics (e.g. they are
reminiscent of the basalt columns forming through rapid cooling). Voronoi cells
can also be used to find interpolation (or differential) weights on
unstructured meshes, and to solve nonlinear inverse problems.
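
The defining property of Voronoi cells (each location belongs to its nearest
point) can be sketched in a few lines of Python. This does not construct the
cell polygons; it merely labels locations on a coarse raster by their nearest
point. The three sites, the raster spacing, and the function name are
assumptions for illustration.

```python
def nearest_site(location, sites):
    """Return the index of the site closest to the given (x, y) location."""
    x, y = location
    return min(range(len(sites)),
               key=lambda i: (sites[i][0] - x) ** 2 + (sites[i][1] - y) ** 2)


# Three illustrative generating points (sites).
sites = [(0.0, 0.0), (4.0, 0.0), (2.0, 3.0)]

# Label a raster with spacing 0.5; each label is the Voronoi cell of that location.
nx, ny = 9, 7
labels = [[nearest_site((0.5 * ix, 0.5 * iy), sites) for ix in range(nx)]
          for iy in range(ny)]

for row in reversed(labels):  # print with y increasing upwards
    print("".join(str(label) for label in row))
```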
From a mathematical/computational point of view, regular grids are in general
more efficient and converge faster to the correct solutions. The question of which 
type of mesh is better for a specific geoscientific problem is not easy to answer. 
Truly unstructured grids are the method of choice for dynamic rupture problems 
for faults with irregular geometry and for problems with highly complex geomet- 
rical features (e.g. boreholes, strong surface topography). More on this in the 
chapter on applications. 


3.3.5 Other meshing concepts 


Some of the observations made above indicate that, despite their flexibility, un- 
structured meshes might lead to substantially more computational work. This 
raises the question of whether structured and unstructured grids could not be 
merged and used in those physical domains where they are most efficient. Again 
this is a topic of current research. An example is shown in Fig. 3.8. This ap- 
proach is called a hybrid mesh, where a structured mesh is used in the area with 
little structural complexity and smooth velocity variations. In the region with com- 
plex geometry (here the free surface topography) a triangular mesh is used that 
is more easily generated than a corresponding rectangular mesh that follows the 
surface. 

When physical problems in large computational domains are strongly focused
in space (e.g. shock waves, rupture fronts), then it might make sense to
densify the grid during run time in the area where things are happening. This
is called adaptive mesh refinement (AMR) and plays an extremely important role
in geophysical fluid dynamics. For strongly scattering seismic wave-propagation
problems the use of adaptive meshes is usually not advisable because, after a
relatively short simulation time, energy propagates almost everywhere in the
medium and influences the resulting synthetic seismograms (depending of course
on the source-receiver geometry). It is a matter of current research whether
adaptive mesh refinement should be employed for dynamic rupture problems.

Another scheme that is currently being exploited for static and dynamic mesh
refinement in seismic wave propagation is the octree approach. An octree is a
tree data structure in which meshes are refined in a progressive way. In 3D,
subdividing each side of a cell by a factor of 2 leads to 8 children. An
example is shown in Fig. 3.9. If the goal is to densify a mesh around linear
structures like the one indicated in the figure (e.g. an interface in a
sedimentary layer) the mesh is progressively refined
until the desired accuracy of the solution is obtained.
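
The recursive refinement can be sketched in Python for the 2D analogue (a
quadtree with four children per cell, as drawn in Fig. 3.9). The sketch
assumes a horizontal interface at depth y = 0.3 in a unit square and a fixed
maximum refinement level; the function name `refine` and all parameter values
are hypothetical.

```python
def refine(cell, interface_y, max_level, level=0):
    """Recursively split cells that are cut by the line y = interface_y.

    A cell is a tuple (x0, y0, size). Returns the list of leaf cells.
    """
    x0, y0, size = cell
    is_cut = y0 <= interface_y <= y0 + size
    if level == max_level or not is_cut:
        return [cell]
    half = size / 2.0
    leaves = []
    for dx in (0.0, half):
        for dy in (0.0, half):
            child = (x0 + dx, y0 + dy, half)
            leaves.extend(refine(child, interface_y, max_level, level + 1))
    return leaves


leaves = refine((0.0, 0.0, 1.0), interface_y=0.3, max_level=3)
print(len(leaves), "leaf cells, smallest size", min(size for _, _, size in leaves))
```

Only the cells touching the interface end up at the finest level; elsewhere
the mesh stays coarse. In 3D the two nested loops would become three, giving
the eight children per node of a true octree.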

This is useful for static mesh refinements and may also be used for dynamic 
mesh refinement during simulations. This approach was used to model strong 
ground motion in combination with the finite-element method by Bielak et al. 
(2005). 

In this section we have encountered a variety of ways to discretize space. How- 
ever, what we have not yet discussed is how we go from the description of a 
geophysical model to its computational mesh. 
Fig. 3.8 Hybrid meshes. A structured, regular grid in the lower domain with
low degree of geometrical complexity is combined with an unstructured
triangular grid at the top of the domain with a complicated free surface. A
strong low-velocity domain (black area) is also meshed with an unstructured
triangular mesh. From Hermann et al. (2011).
Fig. 3.9 Octree approach. An octree is a tree data structure. Each node has
eight children. Octrees are used to partition a 3D space by recursively
subdividing it into eight octants (here shown in 2D). This approach allows
efficient refinement of linear or curved internal structures or surfaces.
Figure courtesy of Vasco Varduhn.
5 There is no Nobel Prize for Earth sciences; the Crafoord Prize comes closest.


3.4 The curse of mesh generation 


In a recent meeting, a computational seismologist titled his presentation
'Meshing: an underestimated task'. That almost says it all. The efficient
generation of computational meshes is (1) often difficult, (2) time consuming,
and (3) a task for which Earth scientists are usually not trained. I think it
is fair to say that, while many excellent solvers exist today for the accurate
simulation of 3D wave propagation, we are far away from having standard
workflows for the generation of computational meshes for wave propagation and
rupture problems.

The situation will get worse in the sense that for leading-edge research
questions we are using 3D simulation tools to investigate the response of
fairly complicated models (you won't get the Crafoord Prize⁵ with 1D
simulations). Now that community codes like specfem3d or SeisSol exist that
take as input externally generated meshes, researchers who want to use those
codes need to deal with mesh generation. Here, we will only introduce some
fundamental concepts that help to understand the issues involved.

Let us take an example. Grenoble is situated in a beautiful valley in the French 
Alps (and also hosts one of the best labs in seismology in Europe). Like most of 
the alpine valleys, Grenoble is built on fairly thick layers of alluvial sediments that 
might resonate when seismic waves enter. With some potential for sizeable earth- 
quakes in the vicinity, it is worth investigating the ground motion expected for 
some realistic scenario. As the velocity structure of the area is fairly well known, 
a few years ago a community benchmark project was set up (Chaljub et al.,
2010). The goal was to compare solutions from various numerical solvers for
this challenging problem. 

What is the input for such a geophysical model? First, there is the surface
topography that, given the alpine setting of Grenoble, has to be taken into account.
Surface topography is usually given in terms of a digital elevation model (DEM) 
that consists of a regular surface grid in appropriate coordinates with the eleva- 
tion in vertical direction. Second, the location of the interfaces of the sedimentary 
layers might be given in a similar form, or (even better) as parametric surfaces. 
Third, the location of internal fault surfaces might be given, on which (finite) seis- 
mic sources have to be activated. How (on Earth) can we create a computational 
mesh with this information? 

Mesh generation has two major steps: (1) geometry creation, and (2) mesh
generation. Geometry creation involves the definition of surfaces bounding the
mesh volume. These can be the free surface, internal interfaces, or the domain
boundaries. Often, this is the most time-consuming process. For example, when
arrays of grid points (or even unstructured points) describe surfaces, a
parametric description has to be found (e.g. using spline functions). Often
this involves simplifying and subdividing geometric structures manually.
Finally, surfaces have to be joined up to create a closed volume.

The mesh-generation process takes as input the created geometry and
subdivides the entire volume into grid cells. At this point a decision has to
be taken, depending on which solver is going to be used, whether to use
hexahedra or tetrahedra as the basic element type (no other element types are
currently in use for large-scale 3D simulations in seismology). In Fig. 3.10 an
example of a hexahedral mesh for the Grenoble valley model is shown. Hexahedral
meshes cannot usually be obtained fully automatically, and require substantial
manual interaction during the mesh-generation process. Once the mesh has been
created and grid points are defined, the geophysical parameters can be
initialized and boundary conditions can be associated with the surfaces.

Concerning automatic mesh generation, tetrahedral elements have substantial 
advantages. Once the geometry is defined, tetrahedral meshes of high quality can 
usually be obtained in an automated way, considerably speeding up the overall 
meshing work flow. However, this comes at the price of substantially higher com- 
putational costs for the solver. This will become clear when discussing the various 
numerical methods. 

An example of a tetrahedral mesh for a volcano structure is shown in Fig. 3.11. 
Unstructured grids have several advantages. In this case, as the interest is in mod- 
elling waves right under the volcano summit, the structure of interest can be 
embedded in a half sphere in an elegant and efficient way. 

There are a number of (mostly commercial) programs available to handle
the geometry- and mesh-generation workflow for hexahedral meshes.⁶ To my
knowledge, the most commonly used meshing software at the present time is
CUBIT (with a linked open source library called GEOCUBIT providing specific
functionalities for Earth sciences).

If you are planning a simulation task involving mesh generation from scratch, 
the following rough estimate of timing might be useful. Table 3.1 indicates how 
much time is spent for a meshing and simulation problem. Of course this might 
vary substantially depending on the specific geometries to be meshed. It is a 
warning, however, that the effort for meshing should not be underestimated. 


3.5 Parallel computing 


When you download a community code for 3D wave propagation (e.g. specfem3d
or SeisSol) it will be a parallel implementation of the specific numerical method.
What is parallel computing? Why do we need it? How does it work? What
Fig. 3.10 Hexahedral mesh. Computational mesh for the Grenoble basin code
verification exercise. The mesh is based on curved hexahedral elements. The
mesh is refined towards the centre of the model (sedimentary basin) where
seismic velocities are substantially lower. Interfaces are not honoured. This
mesh was used for spectral-element simulations. From Igel et al. (2015).


Fig. 3.11 Tetrahedral mesh of Merapi volcano, Indonesia. The volcano is
embedded in a hemispherical mesh with densification in the summit area where
the mesh follows the topography. From Igel et al. (2015).


Table 3.1 Timing estimates for meshing and simulation problems

Human time     Simulation workflow           CPU time
15%            Design                        0%
80% (weeks)    Geometry creation, meshing    10%
5%             Solver                        90%

Source: E. Casarotti, personal communication


6 Examples are: CUBIT, TRELIS, GMSH, GID, ABAQUS, ANSYS, HEXPRESS, and others.
Fig. 3.12 The Flynn taxonomy. Classification of computer architectures.


7 We will encounter John von Neumann (1903-1957) again when analysing
numerical solutions.


programming languages are used to parallelize codes? Does it matter which
computer it will be running on? No doubt, one could write entire books
answering these questions! The goal here is to provide you with some basic
concepts and terminology that should allow you to understand the documentation
provided with current parallel community codes. Some oversimplifications are
unavoidable, for which I ask computational scientists for their indulgence.

The original approach for computing based on the von Neumann⁷ model is
serial, that is, one operation is done at a time, operating on a single data
item. This computing model obviously has its limitations. In particular, it is
limited by the time it takes to do one operation step (a cycle). Until
recently, the continuing pursuit was to increase the so-called clock rate. This
raised in particular the temperature (and energy consumption) of the
processing units (until laptops started catching fire!).

For many problems (e.g. cryptography, simulation of physical phenomena, 
Monte Carlo sampling) that require large computational resources, (1) the same 
operation has to be carried out on different data (data parallelism) or (2) different 
tasks need to be carried out in parallel to obtain a final result (task parallelism). 
Therefore, the efforts to develop computer hardware that works in parallel started 
early. Computer scientist Michael Flynn came up with a terminology that allowed 
the classification of computer models. It is still very useful today, and is illustrated 
in Fig. 3.12. 

The classic serial model (SISD) was extended to allow a single instruction to
be applied to a large amount of data (SIMD) distributed across different
processors. An example is a matrix that is multiplied by a real number. In
this case, no information exchange between processors is needed (this is
called embarrassingly parallel). However, if you would like to know the
maximum value of the matrix, information needs to be exchanged (see
exercises). One of the first such (massively) parallel computers dedicated to
the Earth sciences was installed in 1990 at the Institut de Physique du Globe
in Paris in connection with the visionary work of Albert Tarantola and Peter
Mora in the Geophysical Tomography Group (GTG).
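
The difference between the two matrix operations mentioned above can be
mimicked in serial Python. In this sketch the "processors" are simply
sub-lists, the data are distributed in a cyclic fashion, and all names and
values are assumptions for illustration.

```python
# Illustrative data, distributed cyclically over four imaginary processors.
values = [3.0, -1.0, 7.5, 2.0, 0.5, 9.0, -4.0, 6.0, 1.0, 8.0]
n_proc = 4
chunks = [values[rank::n_proc] for rank in range(n_proc)]

# (1) Multiplication by a real number: every processor works on its own chunk
#     and never needs to look at anybody else's data.
scaled_chunks = [[2.0 * v for v in chunk] for chunk in chunks]

# (2) Global maximum: each processor can find its local maximum on its own ...
local_maxima = [max(chunk) for chunk in chunks]
#     ... but combining them is a communication step (a so-called reduction).
global_maximum = max(local_maxima)

print(local_maxima, global_maximum)
```

Step (2) is cheap here, but on real hardware the final reduction requires
data to travel between processors, and this is where parallel efficiency is
won or lost.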

The SIMD architecture can be viewed as memory that is distributed and 
accessed by a front-end computer. This simple architecture works very effi- 
ciently for image processing and also for some numerical simulation approaches. 
Recently this type of architecture has been revived through the use of GPU 
technology. It appears that history played a substantial role in the evolution of 
parallel computing. The early architectures were strongly supported by military 
applications (e.g. cryptography), allowing the development of machines (like the 
Connection Machine CM-2) with relatively limited domains of application. Other
support came from the geophysical exploration industry which saw opportunities 
to dramatically speed up seismic processing and imaging. The end of the Cold 
War led to fundamental changes in the funding of computer technology and the 
requirement for parallel computers with more flexibility. Support from military 
sources faded. 

At least to some extent this accelerated the development of the so-called
MIMD computer models where basically each processor can perform different
tasks in parallel. Such parallel computers are a lot more flexible and can
improve computational efficiency for most conceivable problems. This initiated
a route through technical developments in terms of hardware and programming
software that is best characterized as disruptive technology. But more on this
later.

Let us have a look at the consequences of parallel computing for simulations 
of dynamic physical systems. 


3.5.1 Physics and parallelism 


Nature works in parallel. The same physical laws (as far as we know) are valid 
everywhere in space. The partial differential equations describing processes like 
elastic wave propagation or geophysical fluid dynamics are statements of this 
parallel nature. Take the example of (source-free) 1D wave propagation 


$$\partial_t^2 p(x,t) = c^2(x)\, \partial_x^2 p(x,t), \tag{3.7}$$

where p(x, t) is the pressure, and c(x) is the velocity model. Writing this
equation using finite-difference approximations to the partial derivatives
(see next chapter for the derivation) leads to

$$p(x, t+dt) = c^2(x)\, \frac{dt^2}{dx^2} \left[ p(x+dx, t) - 2p(x,t) + p(x-dx, t) \right] + 2p(x,t) - p(x, t-dt), \tag{3.8}$$


where dt and dx are temporal and spatial increments. This mathematical state- 
ment can be phrased as: 


The (immediate) future t + dt of a physical system (here the pressure
p(x, t + dt) at some point in space x) depends (only) on the values in its
immediate neighbourhood p(x ± dx, t), the present p(x, t), and the recent past
p(x, t - dt).


This illustrates (maybe more clearly than the original partial differential equation) that
the spatio-temporal interaction of elastic wave (and many other) phenomena is 
of a local nature. This property has tremendous implications for the paralleliza- 
tion of this type of problem (at least when finite-difference type approximations 
are used): Assuming that space-dependent fields are distributed across many pro- 
cessors (today O(10°)), it appears that—when communication is only necessary 
between neighbouring processors—this is very efficient (see Fig. 3.13). Clever 
networking schemes can alleviate problems with non-local communication. 
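
The local character of the update can be verified with a few lines of Python.
The sketch below applies one time step of the finite-difference scheme for 1D
wave propagation (new value from the two spatial neighbours, the present
value, and the previous value) to a single spike. The homogeneous model, the
grid size, the increments, and the zero boundary values are all assumptions
made for brevity.

```python
def fd_step(p_now, p_old, c, dt, dx):
    """Advance the pressure field by one time step.

    p_new = c^2 dt^2/dx^2 [p(x+dx) - 2 p(x) + p(x-dx)] + 2 p(x) - p_old(x).
    Boundary points are simply kept at zero.
    """
    n = len(p_now)
    p_new = [0.0] * n
    for i in range(1, n - 1):
        # Only the two direct neighbours of grid point i are needed.
        second_diff = p_now[i + 1] - 2.0 * p_now[i] + p_now[i - 1]
        p_new[i] = (c[i] ** 2 * dt ** 2 / dx ** 2 * second_diff
                    + 2.0 * p_now[i] - p_old[i])
    return p_new


# Illustrative set-up: homogeneous velocity, a single spike in the centre.
nx = 101
c = [1.0] * nx
dt, dx = 0.5, 1.0
p_old = [0.0] * nx
p_now = [0.0] * nx
p_now[50] = 1.0

p_new = fd_step(p_now, p_old, c, dt, dx)
touched = [i for i, value in enumerate(p_new) if value != 0.0]
print(touched)
```

After one step only the spike and its two direct neighbours are non-zero. A
processor that owns a chunk of the grid therefore needs just one extra value
from each neighbouring processor per time step.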

This degree of locality of a numerical algorithm has important consequences 
and might differ substantially depending on the specific mathematical approxi- 
mations used. In fact, the need for non-local communication has hampered the 
use of some (very elegant and accurate) methods for wave-propagation problems 
Fig. 3.13 Parallel processing. Design study following a form-follows-function
approach for one of the first parallel computers, the Connection Machine CM-1
of Thinking Machines Corp. with connected cubes indicating processors that
communicate with each other. A photo of famous physicist Richard Feynman (who
designed the communication scheme) wearing a T-shirt with this logo was used
by Apple™ in their ‘Think Different’ campaign in the nineties. The actual
design of the CM-1 was based on this logo; apparently the only time that a
computer was designed after a T-shirt. Design concept by and figure courtesy
of Tamiko Thiel.
Fig. 3.14 Domain decomposition. When space-dependent fields are discretized
the question is which processor gets which chunk of data. Even for regular
meshes in 1D or 2D (top) this is not obvious. Here each colour represents a
specific processor (or node). Partitioning volumes for unstructured grids in
3D (bottom), optimizing the load balance of processors (or nodes), can be a
challenging task.


(e.g. pseudospectral methods) on parallel hardware, because the communication 
overhead is too high. 


3.5.2 Domain decomposition, partitioning 


How can we make best use of parallelization for our wave-propagation problem? 
All numerical methods discussed in this volume (and most 3D wave simulation 
codes currently used in the seismological community) are based on time-space 
domain numerical solutions to the wave equations. This implies that space- 
dependent fields (e.g. displacements, stresses, strains, elastic parameters, seismic 
velocities, density) are mapped on parallel hardware by domain decomposition 
using the distributed memory concept. That means the tasks only see local memory. 

For regular grids in 1D to 3D this is usually a trivial task, as shown in
Fig. 3.14, top. The discretized spatial domain, stored as a vector or matrix
accordingly, is subdivided into n areas of equal size, n being the number of
available processors. A further, less obvious task is that of deciding which
processor gets which chunk of the subdivided volume. In Fig. 3.14, processors
are illustrated by different colours and various strategies for their
allocation exist. As indicated above, the cost of communication between
processors might well depend on the physical distance between them, which
means that, depending on the communication requirements of a specific problem,
the mapping may have a strong impact on the computational efficiency.
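
For a regular 1D grid the decomposition and the resulting communication
pattern can be sketched as follows. The Python functions `decompose_1d` and
`halo_indices` are hypothetical helpers; the sketch assumes a three-point
stencil, so that each processor needs exactly one value (a so-called halo or
ghost point) from each direct neighbour.

```python
def decompose_1d(n_points, n_proc):
    """Return for each rank the half-open index range [start, stop) it owns."""
    base, extra = divmod(n_points, n_proc)
    ranges, start = [], 0
    for rank in range(n_proc):
        stop = start + base + (1 if rank < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def halo_indices(ranges, rank):
    """Map neighbouring rank -> global index of the value needed from it."""
    start, stop = ranges[rank]
    halo = {}
    if rank > 0:
        halo[rank - 1] = start - 1   # last point of the left neighbour
    if rank < len(ranges) - 1:
        halo[rank + 1] = stop        # first point of the right neighbour
    return halo


ranges = decompose_1d(n_points=10, n_proc=4)
for rank in range(len(ranges)):
    print(rank, ranges[rank], halo_indices(ranges, rank))
```

When the number of grid points is not divisible by the number of processors,
the chunks differ by at most one point. Which physical processor each rank is
mapped to is a separate question that this sketch does not address.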

This problem is much harder for unstructured grids, with the added complexity
that each element might require a different number of operations per time step
(e.g. when space-dependent fields are described with different polynomial
order in different elements; this is called p-adaptivity). Fig. 3.14 (bottom)
is an example of a mesh partitioning and mapping onto various processors,
optimizing the load balancing for the volcano mesh. With load balancing one
tries to keep all processors happily active
until some synchronization between the program parts is carried out.
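
A very simple load-balancing heuristic can be sketched in Python: sort the
elements by cost and always hand the next one to the processor with the
smallest load so far. Note the strong assumption: this greedy scheme ignores
which elements are neighbours, and hence the communication cost, which
partitioning software used in practice has to take into account. The cost
values and the function name are made up for illustration.

```python
import heapq


def greedy_balance(costs, n_proc):
    """Assign elements to processors, most expensive first, least loaded first.

    Returns (owner, loads): owner[e] is the rank of element e, loads[r] the
    summed cost on rank r.
    """
    heap = [(0.0, rank) for rank in range(n_proc)]
    heapq.heapify(heap)
    owner = [None] * len(costs)
    for elem in sorted(range(len(costs)), key=lambda e: -costs[e]):
        load, rank = heapq.heappop(heap)       # currently least-loaded rank
        owner[elem] = rank
        heapq.heappush(heap, (load + costs[elem], rank))
    loads = [0.0] * n_proc
    for elem, rank in enumerate(owner):
        loads[rank] += costs[elem]
    return owner, loads


# Hypothetical cost per element (e.g. growing with polynomial order).
costs = [5, 3, 3, 2, 2, 1]
owner, loads = greedy_balance(costs, n_proc=2)
print(owner, loads)
```

In this small example both processors end up with the same total cost, even
though they own different numbers of expensive and cheap elements.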

If you ask how this can be done efficiently, and how portable such codes 
are, you are asking the right questions. Even for regular meshes, optimizing load 
balancing and overall performance has become an issue that is best left to com- 
putational scientists. That is one of the reasons why we are increasingly using 
community codes that have been developed with the help of computational scien- 
tists. That makes it more difficult to understand what is going on inside (just like 
with every new generation of cars). 

Well, one reason for writing this volume is to provide a look under the bonnet. 


3.5.3 Hardware and software for parallel algorithms 


We have learned that we make use of parallel computers by distributing our
meshes (at best equally) across the available processors. Most likely when you
read this, you will already have encountered some programming language that is
used on parallel hardware such as C, C++, Fortran, or Python. The remaining
questions concern (1) how to tell a program to run in parallel and (2) how your
parallel coding depends on the specific hardware.

Before describing the situation as it is today, allow me to go back to the time
when the first (SIMD-type) parallel computers appeared on the market.⁸ Due
to their relatively simple architecture, high-level extensions to standard
languages like Fortran (e.g. Connection Machine Fortran, CMF) were provided
with the software that made parallel code development relatively easy.

With the increasing complexity of hardware (e.g. the CM-5 was an MIMD 
machine), and the variety of vendors of parallel machines, high-level pro- 
gramming solutions for simple parallel problems (like domain decomposition) 
vanished, despite attempts like High-Performance Fortran (HPF), Cray Fortran 
(CRAFT), and others. What survived was the more flexible approach based on 
the Message-Passing Interface (MPI), which allows solutions for most types of 
parallel problems, albeit with a substantially bigger overhead for parallelization of 
serial programs. The situation, as far as programming was concerned, got even 
more complicated with the appearance of hybrid architectures and GPU clusters. 

In a panel discussion a few years ago with several representatives of parallel 
computer vendors, it was acknowledged that this is a very unfortunate situation 
for domain scientists who are used to developing their own software solutions. 
The consequence was a paradigm shift in the development of efficient parallel 
simulation software that is still ongoing. Today, efficient parallel software de- 
velopment is only possible (and yes, it is fun) by permanent interaction and 
collaboration with computational scientists. 


3.5.4 Basic hardware architectures 


To understand the existing programming models for parallel algorithms it is
necessary to introduce the basic hardware models. In fact, due to recent de- 
velopments (e.g. in GPU technology) computational scientists often speak of 
hardware-software co-design. That means that sometimes hardware is designed 
to be optimal for a specific software problem (e.g. image processing) and vice 
versa (e.g. simulation software is programmed such that it is optimal on specific 
hardware). 
Some basic parallel hardware concepts are illustrated in Fig. 3.15. These are: 


Shared Memory: All processors have direct access to common physical memory. 
This is a model where parallel tasks see the same state of memory and can access 
the same logical memory locations no matter where physical memory exists. 


Distributed Memory: Architecture with distributed memory implies network- 
based access to physical memory that is not common. A computer program only 
sees local memory. To access memory from other machines (processors) specific 
communication is necessary (e.g. message passing). 


Hybrid Distributed-Shared Memory: A parallel computer that consists of dis- 
tributed nodes, each of which is a parallel shared-memory model with a certain 
Fig. 3.15 Basic parallel architectures. Top: Shared memory architecture.
Middle: Distributed memory architecture. Bottom: Hybrid model with several
shared-memory nodes.


8 In fact, in early 1990, three weeks after I started my PhD at the Institut
de Physique du Globe in Paris, the Connection Machine CM-2 was delivered, and
our research group had almost exclusive access to it. The programming
environment was a dream compared to today. Domain parallelization was
basically automatic. The programming overhead for parallelization was almost
nil. In addition, application engineers assisted with the scientific code
development.
Fig. 3.16 Parallel supercomputer. The hybrid supercomputer SuperMUC located
at the Leibniz Supercomputing Centre in Munich, Germany, with 241,000 cores,
which together delivered almost 7 PFlops in 2015. SuperMUC uses a
revolutionary form of warm water cooling. The buildings are heated reusing
this thermal energy.


number of processors. At the time of writing this is a popular model for
flexible supercomputers (e.g. the SuperMUC at the Leibniz Supercomputing
Centre in Munich, Germany, Fig. 3.16). While, in principle, hybrid computers
can be programmed with message-passing concepts alone, in practice code that
exploits them most efficiently needs to be optimized at both the
shared-memory and distributed-memory levels.

As indicated above, the industry of parallel computing started with relatively 
homogeneous assemblies of processors connected by an appropriate network (like 
the Connection Machine CM-2 or CM-5). With the increasing power of personal 
computers (PCs) in the 90s, and the evolution of free operating systems like 
Linux, the concept of parallel clusters (mostly Linux-based) emerged. Local area 
networks (LANs) could then in principle be used as parallel computers. Today, 
departmental-scale clusters with O(10? — 10*) processors are still attractive alter- 
natives for meso-scale simulation tasks to using supercomputer resources (with 
potentially long queueing times). 

Soon after, the cluster concept was taken to a higher level, linking up comput- 
ers that were not (more or less) physically collocated. This was the birth of the 
GRID initiatives linking many (really geographically) distributed, heterogeneous 
computational resources at local, national, and international levels. It is fair to say 
that cloud computing developed out of the concepts of GRID Computing, provid- 
ing on-demand resources. With a few exceptions, GRID and cloud computing 
have not (yet) played an important role for seismic simulations, but this might 
well change in the near future. 


3.5.5 Parallel programming 


What programming models are used today to parallelize seismic simulation
software? Most solvers implemented on classic CPU-based supercomputers or
computer clusters use the Message-Passing Interface standard (MPI,
http://www.mpi-forum.org). As there is a massive number of tutorials and other
information on the Web, we will just present the basic underlying concept. MPI
consists of libraries that can be called from Fortran, C, and C++ programs.
Recently, MPI implementations with Python have also been developed (e.g. pyMPI).
Consider the following MPI Fortran code example:


    ! My first MPI program
    program main
      use mpi
      integer error
      integer id
      integer np
      ! Initialize MPI.
      call MPI_Init ( error )
      ! Get the number of processes.
      call MPI_Comm_size ( MPI_COMM_WORLD, np, error )
      ! Get the individual process ID.
      call MPI_Comm_rank ( MPI_COMM_WORLD, id, error )
      ! Print a message.
      write (*,*) "The overall number of processors is ", np
      write (*,*) "I am processor ", id
      ! Shut down MPI.
      call MPI_Finalize ( error )
      stop
    end


In this Fortran90-MPI program the MPI library is called and the program is 
initialized for np processors. After compilation into an executable program like 
main.x, the program can be executed using the mpirun command as 


> mpirun -np 4 main.x 


which will run the executable independently on each of the four processors in this 
case, with the result 


    I am processor ...
    I am processor ...
    I am processor ...
    I am processor ...


where the sequence is indeed random, depending on the clock cycle in which
the output is performed in each processor. This simple example illustrates the
flexibility of the MPI concept. Identical programs run independently on any
processor of the computer cluster until there is a statement to exchange
information between processors. Tasks for individual processors can be given
using if statements that identify the processor id. The interested reader is
referred to the available online material for further examples. For
shared-memory models, additional libraries exist such as OpenMP
(http://www.openmp.org), which exploit the specific architecture. As indicated
above for hybrid models, the shared- and distributed-memory programming models
might have to be combined to achieve
optimal performance of a specific parallel computer code.
How do you measure whether a parallel code is performing well? This
question leads to the concept of scaling or scalability. Scalability refers to
the speeding up of a program with increasing resources. A simple equation that
describes scalability is

$$\text{speed-up} = \frac{1}{\frac{P}{n} + S}, \tag{3.9}$$

where P is the fraction of the code that can be parallelized, S is the serial fraction,
and n is the number of processors. This is illustrated in Fig. 3.17. It is interesting

Fig. 3.17 Speed-up of parallel software. The speed-up is given as a function
of the fraction of the code that is parallelizable P and the serial part S
(see text for details).
9 The spectral-element code specfem3d won the Gordon Bell Award for Best
Performance at the SuperComputing 2003 conference in Phoenix.


Fig. 3.18 Strong scaling. Example of 
strong scaling behaviour of the Seis- 
Sol seismic simulation software based on 
the discontinuous Galerkin method. The 
simulation of wave propagation through 
a volcano is discretized with about 10° 
tetrahedra. Flops refers to floating point operations per second including
zero operations. The non-zero values refer to the scaling for calculations not
involving any zeros. The obtained absolute peak performance is quite
sensational for a real application code, leading to the code being a finalist
for the Gordon Bell Prize in 2014. Figure courtesy of Michael Bader and
Alexander Breuer.


to note how adding more processors does not lead to much more speed-up, even
for high percentages of parallelizability. For many problems (like seismic
wave calculations) the parallel fraction usually increases with problem size,
improving the speed-up for large problems. It is easy to understand that I/O
might substantially deteriorate the performance of parallel applications.
Therefore parallel I/O
is a hot topic, particularly for data-rich problems (see next section).
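
The saturation of the speed-up is easy to reproduce numerically. The Python
sketch below evaluates the relation speed-up = 1/(P/n + S) under the
assumption that the serial and parallel fractions add up to one (S = 1 - P);
the function name and the chosen fractions and processor counts are
illustrative.

```python
def speed_up(parallel_fraction, n_proc):
    """Speed-up 1 / (P/n + S), assuming S = 1 - P."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (parallel_fraction / n_proc + serial_fraction)


for p in (0.50, 0.90, 0.95):
    row = [round(speed_up(p, n), 2) for n in (1, 16, 256, 4096)]
    print("P = %.2f:" % p, row, "upper limit:", round(1.0 / (1.0 - p), 2))
```

However many processors are added, the speed-up can never exceed 1/S. With 5
per cent serial code this limit is 20, no matter how large the machine.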

There are two classes of scaling. Strong scaling investigates how the run time
varies with the number of processors for a fixed total problem size. Weak
scaling refers to how the run time varies when processors are added while the
problem size per processor is kept fixed. A recent example of strong scaling
for the SeisSol code based on the discontinuous Galerkin method is shown in
Fig. 3.18. This remarkable scaling behaviour as well as the obtained peak
performance (1.09 PFlops) was the result of several years of performance
optimization (Breuer et al., 2014), which was recognized by the nomination as
a finalist for the Gordon Bell Prize in 2014.⁹

Finally, it is worth noting that, similar to the efforts in the car industry to
reduce fuel consumption per 100 km and CO₂ emissions, supercomputing is going
green. Green computing refers to the attempt to develop environmentally
sustainable computing. Supercomputers consume as much energy as small cities.
For example, an analysis of the impact of convergence order, CPU clock
frequency, vector instruction sets, and chip-level parallelism on the execution
time, energy consumption, and accuracy of the obtained solution for the SeisSol
software showed that increasing the accuracy order of the numerical scheme from
2 to 7 reduced the computational error by up to five orders of magnitude while
consuming no extra energy (Breuer et al., 2015).

As the number of processors inside supercomputers, Linux clusters, PCs,
tablets, and even your smartphone increases, scaling to large processor numbers
is an extremely important issue. When requesting resources on national or
international supercomputer infrastructure you have to demonstrate that your
program scales. Again, this is an issue that requires interaction with
computational scientists in most cases.
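
Equation 3.9 is easy to explore numerically. The following minimal Python sketch
assumes that the serial and parallel fractions sum to one (S = 1 - P); the
function names are hypothetical and chosen for illustration only.

```python
def amdahl_speedup(parallel_fraction, n_processors):
    """Speed-up according to Eq. 3.9, assuming S = 1 - P."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_processors)


def parallel_efficiency(parallel_fraction, n_processors):
    """Speed-up divided by the number of processors (1.0 would be ideal)."""
    return amdahl_speedup(parallel_fraction, n_processors) / n_processors


if __name__ == "__main__":
    # Tabulate the speed-up for a few parallel fractions and processor counts
    for p in (0.5, 0.9, 0.99):
        speedups = [amdahl_speedup(p, n) for n in (1, 10, 100, 1000)]
        print(p, [round(s, 2) for s in speedups])
```

For P = 0.9 the speed-up stays below 1/S = 10 however many processors are added,
which is the saturation effect described above.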


3.5.6 Parallel I/O, data formats, provenance 


Let us make a simple calculation. You are running a simulation on a
supercomputer for an Earth model discretized on a grid with
1,000 x 1,000 x 1,000 = 10⁹ grid points (nothing unusual). You might want to
output some snapshots, for example of the seismic wave field, to analyse the
results. How would you extract this information on a supercomputer with
data-distributed architecture? How would you store such a file? How would you
post-process it? Would you transfer it to your local institutional PC and work
with it there? Well, one snapshot in double precision would occupy 8 GBytes.
When looking at snapshots you usually have hundreds or thousands of them, so you
quickly have TBytes of data. This is not exorbitant compared to other fields
like astronomy where sometimes PBytes of data are generated from observations.
Nevertheless, it is obvious that you are unlikely to be able to easily handle,
transfer, or post-process this data on the usually serial (or mildly parallel)
hardware available locally.
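
The arithmetic behind these numbers can be written down in a few lines. This is
only a sketch: the function name and the choice of 500 snapshots are assumptions
made for illustration, and 1 GByte is taken as 10⁹ bytes.

```python
def field_size_bytes(nx, ny, nz, bytes_per_value=8):
    """Memory of one space-dependent field (8 bytes per value = double precision)."""
    return nx * ny * nz * bytes_per_value


one_snapshot = field_size_bytes(1000, 1000, 1000)
print(f"{one_snapshot / 1e9:.1f} GBytes per snapshot")  # prints 8.0 GBytes per snapshot

n_snapshots = 500  # hypothetical number of stored snapshots
print(f"{n_snapshots * one_snapshot / 1e12:.1f} TBytes in total")  # prints 4.0 TBytes in total
```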

The question of how to input (e.g. an Earth model) or output (e.g. snapshots) 
from a parallel computer leads to parallel I/O formats. Obviously, for computers 
with distributed memory and many cores the worst option would be to send all 
data to one main processor and then output it serially. Parallel I/O allows data 
to be input and output in parallel to a storage device, implying that, outside the 
supercomputer, parallel storage facilities are available. Libraries like MPI allow 
parallel I/O, and the interested reader is referred to the relevant documentation. 
In many fields the increase of data volumes has led to domain-specific parallel data 
formats that adapt to the needs of a specific scientific domain, allowing efficient 
data exchange and data sharing. The main goals are: (1) handling large volumes;
(2) coping with complex data (e.g. time series with different formats, sampling
rates); (3) provision for heterogeneous data (e.g. binary seismic data, earthquake
information, Earth model information); (4) providing options for simulation data
(in addition to observations); and (5) access to provenance information (e.g. what
happened to the data previously, which solver was used for synthetic data).

An attempt to provide a novel data format for seismology is currently under way
(called Adaptable Seismic Data Format, ASDF; Krischer et al., 2016) that
addresses some of these problems. It is based on the hierarchical data format
(HDF) approach with its current implementation HDF5. HDF5 is a standard format
for binary data with a large number of tools and support. The
seismology-specific adaptation consists of a container that stores three
fundamental categories of information: data (source and waveforms), metadata
(attributes for a piece of waveform data), and provenance (description of how a
piece of waveform data was generated). The contents are illustrated in
Fig. 3.19.

Fig. 3.19 Adaptable Seismic Data Format (ASDF, http://www.seismic-data.org). A
file with seismic data containing not only the seismic traces themselves but
additional station information, and information on earthquake parameters,
synthetic seismograms, and provenance (let’s not forget the data itself!). The
format is based on the hierarchical data format approach. Figure courtesy of
Lion Krischer.

A data format like this has the potential to (1) substantially reduce
time-to-research if accepted by the community, and (2) make results more
reliable and reproducible. There is a multitude of formats for seismic
observations alone.¹⁰ Until recently, that multitude was complemented by
another, of conversion programs that allowed the movement between these formats.
In the past few years the situation has been improved substantially by the
open-source project ObsPy (http://www.obspy.org, Krischer et al., 2015b;
Beyreuther et al., 2010). The Python library ObsPy has incorporated the various
data formats and allows stable operations on most available seismic data.

This section has touched on the big data discussion that is currently ongoing 
in many fields. It is a rapidly evolving field and we should expect new develop- 
ments along the lines discussed above that substantially improve the practice of 
our everyday scientific work. 


3.6 The impact of parallel computing on Earth Sciences


Earth Sciences and seismology in particular (with a few exceptions such as earth- 
quake physics) can be considered data-rich. The analysis of seismic data, mostly 
using approximate theories to explain observations (e.g. ray theory, ray tomog- 
raphy, and wave propagation in Cartesian and spherical layered media), has 
dominated seismology in the first decades of its relatively short history. The situ- 
ation today, with both digital data and the capacity to compare observations with 
synthetic data based on 3D simulations has already begun to, and will continue 
to, substantially improve our understanding of the structure of the Earth’s interior 
on all scales, and will thereby help us understand better how our planet works. 


This implies that most seismological research projects involve the use of 3D 
simulation software. The common practice until a few years ago for a PhD stu- 
dent or postdoc was to basically start from scratch, write a solver (e.g. using 
the finite difference method), do research with it, and submit a dissertation (and 
leave). In many cases maintaining software for later use was extremely difficult. 
This approach no longer works today, leading to the paradigm shift that I alluded 
to above: Most leading-edge problems require larger-scale simulations working 
on huge data sets, requiring substantial post-processing resources (e.g. visual- 
ization, filtering, etc.) that can rarely be developed and maintained by (small) 
research groups. As discussed, efficient IT solutions require a substantial amount 
of development time. Therefore the ongoing developments to openly distribute 
simulation software for Earth sciences (e.g. CIG, http://www.geodynamics.org) 
with sufficient documentation and training material, as well as the develop- 
ment of community platforms such as VERCE (http://www.verce.eu) or EPOS 
(http://www.epos-ip.org), should be welcomed. 

However, this situation creates new problems. While during the time of heroic 
coding of PhD students and postdocs at least the researcher knew exactly what was 
under the bonnet, now we are mostly dealing with black boxes. Today, community 
simulation codes are fairly stable and portable, and can be easily downloaded 
and run without knowledge of numerical methods or parallel computing. The 
danger is that incorrect initializations can lead to erroneous results that are hard 
to distinguish from correct solutions. 

A main goal of this volume is to turn the black box at least into a fairly transpar- 
ent box in which you have some idea of what is going on inside the codes. It is my 
firm belief and the basic concept for this volume that if you manage to write a 1D 
simulation code from scratch in whatever language (Matlab, Python, Fortran, 
C, Julia), and investigate how it works, you have come quite a long way to un- 
derstanding how a 3D code works, and you are much less likely to commit errors 
when dealing with complex simulation tasks. The substantial electronic material 
that we provide with this volume should help you in taking this path. 


Chapter summary 


• (Most) 3D wave-propagation simulation software is based on the
discretization of an Earth model on a structured or unstructured mesh.

• The generation of meshes, in particular for complicated Earth models, is a
challenging task.

• Large-scale simulations in seismology require parallel computing, that is,
hardware that allows the performance of several tasks at the same time, and
software that allows the programming of the task and/or data distribution.

• Most (time-space domain) seismic simulation codes are based on domain
decomposition. Space-dependent fields are subdivided and distributed in
parallel memory.

• Efficient parallelization of an algorithm often requires careful identification
of the parallelizable part and specific coding with respect to the targeted
parallel hardware.

• The efficiency of parallel software is characterized by its performance (in
terms of percentage of peak performance) and its scalability (speed-up
when more processors are used).

• Parallel code development in the Earth sciences should be done in close
collaboration with computational science.


FURTHER READING 


• There is a huge amount of information in tutorials available online on
parallelization using MPI. A good starting point is http://www.mpi-forum.org.
A very comprehensive treatment of parallel computing is given in Pacheco
(2011).

• Ismail-Zadeh and Tackley (2010) contains a slightly more elaborate section
on parallel programming in Chapter 9 with a view to problems in
geophysical fluid dynamics.

• Durran (1999) is an excellent introduction to numerical methods for
wave-like phenomena and dissipative flows.


EXERCISES 
Comprehension questions 


(3.1) Describe concepts that show how to represent space-dependent functions
in a discrete way.

(3.2) Explain the concept of 2D, 2.5D, and 3D simulations. What problems
can arise for <3D simulations when comparing with observations?

(3.3) Explain the concepts of structured and unstructured meshes. Give
examples.

(3.4) Illustrate the differences between regular meshes in various coordinate
systems: Cartesian, cylindrical, spherical. What are the consequences for
simulation problems?

(3.5) What is a cubed sphere?

(3.6) Explain the concept of Delaunay triangulation and Voronoi cells.


(3.7) Discuss pros and cons of structured vs. unstructured grids. 

(3.8) What are adaptive meshes? Can they be used in seismology? 

(3.9) Give reasons why the generation of meshes is relevant for seismological 
problems. Give examples. 

(3.10) What are the basic models for parallel computers? 

(3.11) What is the most common model of parallelization for seismic wave 
propagation and why? 

(3.12) Explain the concepts of strong and weak scaling. 

(3.13) Find some current supercomputers on the internet and extract the 
main specifications (e.g. number of processors, memory, peak perfor- 
mance, etc.). 

(3.14) Class exercise: Every student gets an integer number starting with 0. This 
number denotes the processor. Each student writes four numbers on a 
page. Perform the following tasks: 


• Single-Instruction-Multiple-Data: The tutor tells all students to
multiply each number by 5.

• Embarrassingly parallel problem: Add the first three numbers and
subtract the fourth.

• Circular shift operation: Pass the first number to your right neighbour.
If there is none, pass it to the far left neighbour.

• Global reduce: Find the maximum value of all initial 4 numbers of all
processors. Processor 0 speaks out the maximum value loudly.

• Global distribute: The tutor gives 2 numbers to processor 0.
Processor 0 distributes these 2 numbers to all processors.

• Extend these exercises to your liking.


Theoretical problems 


(3.15) Classify the following partial differential equations in terms of elliptical, 
hyperbolic, or parabolic problems: 


Uns + 2CUy, + C? Uy 
XUxx% Aur, = 0 (3.10) 


Uxx — 6Uy, + 12uUy = O. 


(3.16) Use Eq. 3.9 to find out what fraction of the code needs to be parallel to 
achieve a speed-up of 10,000 for 20,000 processors. 

(3.17) You want to simulate a physical domain of size (or side length) 1,000 km
with a grid distance of 1 km. Estimate the required memory of one
space-dependent field (double precision) in 1D, 2D, and 3D. Compare with the
RAM of your smartphone, laptop, and with the specifications of current
supercomputers. Discuss the results.

Programming exercises

(3.18) Write a program (e.g. Matlab, Python) that generates arbitrary point
clouds. Triangulate them using the Delaunay method. Calculate and visualize
the corresponding Voronoi cells. Find appropriate libraries to carry out the
tasks.

(3.19) Distributed data: Write a small parallel program using, for example,
Fortran/MPI or pyMPI. Define a matrix A(2,000,2,000) and distribute it on n
processors. Initialize it with random numbers and extract minimum and
maximum values. Perform operations on the matrix in a loop and time
the operations. Compare to the serial case n = 1 and distributed task
parallelism.

(3.20) Task parallelism: Write a small parallel program using, for example,
Fortran/MPI or pyMPI. Load a seismogram trace. In one processor calculate
filtered seismograms (e.g. looping through low-pass filters with various corner
frequencies). In a second processor perform an equal number of subsequent
auto-correlations. Time the parallel and serial codes and compare
the results. Note: If you use Python the ObsPy library offers tools for
seismic data processing.

(3.21) Search for open-source mesh generators (e.g. MeshPy), invent some simple
geometries, generate vtk (visualization toolkit) files, and visualize
them (e.g. with paraview).

(3.22) Install the ObsPy Python library (http://www.obspy.org). Follow the
tutorials and investigate the potential to access and process observed and/or
simulated seismic data.

(3.23) Use ObsPy to download data from any seismic station you are interested
in for the M9.1 Tohoku-Oki earthquake of 11 March 2011. Save the
data using the ASDF format (http://www.seismic-data.org). Explore the
provenance options.


Part II


Numerical Methods


The Finite-Difference Method


Without doubt, the finite-difference method is conceptually the simplest method 
presented in this volume. Historically it was the first numerical method that was 
widely used in seismological research. It is also justified to say that for decades it 
was the workhorse for many research applications. Despite its ‘brute force’ repu- 
tation, it is important to note that a well-designed finite-difference algorithm still is 
capable of beating some other—mathematically more sophisticated—techniques 
with better reputations. 

It all depends on the specific seismological problem. A major advantage of 
mathematical simplicity is that an algorithm is quickly adapted to a specific 
problem. In my view, that is the main reason that some of the best research 
in seismology involving seismic wave calculations in heterogeneous media has 
been achieved with this approach. This involves in particular applications in ex- 
ploration geophysics, strong ground motion problems (see Fig. 4.1), dynamic 
rupture simulations, and inverse problems. 

For students who are interested in understanding partial differential equations, 
the finite-difference method offers an efficient and fast way to develop numerical 
approximations that allow the investigation of some of the main characteristics 
of the problem. In the exercises section of this chapter, some examples that go 
beyond seismology are given. 

This chapter is structured as follows. After a brief section on the history of the 
finite-difference method in seismology we proceed with the introduction of the 
fundamental mathematical concepts. The method is first applied to the acoustic 
wave equation in 1D and 2D and the algorithm is analysed analytically. This leads 
to some of the most important concepts in numerical methods relevant to most 
other methods that are presented in this volume, such as stability, numerical
dispersion, and convergence. The finite-difference approximation of the 1D
elastic wave equation in the velocity-stress form is presented, leading to the
staggered-grid concept in both space and time. At the end of the chapter, several specific
issues are discussed, including the implementation of boundary conditions, other 
non-standard implementations, and recent developments. 


4.1 History 


A historical view is always a compromise. For the purpose of this volume I
only highlight a few milestones in the history of finite differences in
seismology. An extensive review of the historic developments is given in the
excellent book by Moczo et al. (2014).


Computational Seismology. First Edition. Heiner Igel.
© Heiner Igel 2017. Published in 2017 by Oxford University Press.


4.1 History
4.2 The finite-difference method in a nutshell
4.3 Finite differences and Taylor series
4.4 Acoustic wave propagation in 1D
4.5 Acoustic wave propagation in 2D
4.6 Elastic wave propagation in 1D
4.7 Elastic wave propagation in 2D
4.8 The road to 3D


Fig. 4.1 Snapshot of horizontal ground motion using a 3D finite-difference
method. The 1992 M5.3 Roermond earthquake in the Cologne area, Germany, is
simulated using a 3D structure of the sedimentary basin. Red and blue colours
denote positive and negative horizontal ground velocity, respectively. The
low-velocity basin structure amplifies motion compared to the surrounding
bedrock and substantially prolongs shaking. From Igel et al. (2015).

Fig. 4.2 Image of the massively parallel supercomputer CM-2 of Thinking
Machines Corp. that was used in the early nineties for such tasks as seismic
wave simulations at the Institut de Physique du Globe in Paris. The external
hard drive (the data vault, seen centre-right) was the size of a pub bar and had
20 GByte memory (compare with microSD cards today!). The CM-2 had 65,536
processors and a total RAM (random-access memory) of 512 MBytes. On such
machines some of the first parallel finite-difference simulations were
performed by Peter Mora. Figure courtesy of Tamiko Thiel.

The first application of the finite-difference method to elastic wave
propagation can be attributed to Alterman and Karal (1968) who, interestingly
enough at around the same time that the reflectivity method was developed by
Fuchs and Müller (1971), approximated the elastic wave equation in cylindrical
coordinates. This allowed wave-propagation simulations in layered media. Boore (1970)
used the finite-difference method to simulate Love waves and was (to my knowl- 
edge) the first to show snapshots of seismic wave fields. Alford et al. (1974) car- 
ried out a thorough analysis of the finite-difference approximation to the acoustic 
wave equation, comparing numerical results with analytical solutions. This was 
followed by applications to the elastic wave problem by Kelly et al. (1976). 

The now widely used concept of staggered grids was introduced to solve 
the problem of rupture propagation (Madariaga, 1976; Virieux and Madariaga, 
1982), and later adapted to the problem of elastic SH and P-SV wave propagation 
in 2D (Virieux, 1984; Virieux, 1986), respectively. High-order operators improv- 
ing accuracy were presented by Levander (1988). He also introduced the concept 
of stress imaging to implement the free-surface boundary condition. 

The evolution of numerical methods applied to wave propagation was tightly 
linked to the development of computational hardware. A major step towards 
realistic simulations was possible with the invention of parallel computers (an 
example is shown in Fig. 4.2). The CM-2 was tailor-made for finite-difference 
algorithms as it favoured near-neighbour communication. Increasing computer 
power led to the extension of the method to 3D, with initial applications by 
Frankel and Vidale (1992), Graves (1993), Olsen and Archuleta (1996), and 
Pitarka and Irikura (1996). Other rheologies such as viscoelastic behaviour (Day 
and Minster, 1984; Emmerich and Korn, 1987; Robertsson et al., 1994) and 
anisotropic material (Igel et al., 1995) were incorporated shortly thereafter. 




Finite-difference methods were applied to the problem of global wave
propagation with formulations in spherical coordinates, first with the
axisymmetric approximation (Igel and Weber, 1995, 1996; Chaljub and Tarantola,
1997) and for 3D spherical sections (Igel et al., 2002). The implementation of
frictional boundary conditions in 3D finite-difference algorithms (Olsen et al.,
1997) had a strong impact in the field of dynamic rupture analysis. Nielsen and
Tarantola (1992) presented a finite-difference scheme for wave propagation in
which a threshold criterion would allow any node to fail (i.e., break or rupture).

Despite the difficulties of implementing (e.g. free-surface) boundary con- 
ditions, methods were developed and tested to allow the simulation of strong 
topographies, for example in connection with volcano seismology (Ohminato and 
Chouet, 1997) and for other rheologies (Robertsson and Holliger, 1997). Moczo 
et al. (2002) developed strategies for the accurate simulation of waves through 
strongly heterogeneous media. 

At an early stage finite-difference simulations were incorporated in full wave- 
form inversion schemes, initially in 2D, as with Crase et al. (1990), and later in 
3D (Chen et al., 2007). Particularly for exploration-type problems, the finite- 
difference method is now widely used for 3D inversion as body wave propagation 
with finite differences is highly efficient. For a review see Virieux and Operto 
(2009). 


4.2 The finite-difference method in a nutshell


The finite-difference method is the classic example of a grid method. For
space-time dependent problems, such as the equation(s) describing the
propagation of seismic waves, space and time are both discretized (usually) on
regular space-time grids. This implies (unlike series expansion methods) that
the values are only known at these points. In their simplest form, the partial
derivatives in the original equations are replaced by the well-known finite
differences that use the function values at adjacent grid points to approximate
the function’s derivatives. The acoustic wave equation in 1D with constant
density

$$\partial_t^2 p(x,t) = c(x)^2 \, \partial_x^2 p(x,t) + s(x,t) \tag{4.1}$$

with pressure p, acoustic velocity c, and source term s contains two second
derivatives that can be approximated with a difference formula such as

$$\partial_t^2 p(x,t) \approx \frac{p(x,t+dt) - 2p(x,t) + p(x,t-dt)}{dt^2} \tag{4.2}$$

and equivalently for the space derivative. Injecting these approximations into
the wave equation allows us to formulate the pressure p(x) for the time step
t + dt (the future) as a function of the pressure at time t (now) and t - dt
(the past). This is called an explicit scheme, allowing the extrapolation of the
space-dependent field into the future only looking at the nearest
neighbourhood.¹ Actually, it is this near-neighbour communication that makes the
finite-difference method so efficiently parallelizable on modern supercomputers.

Fig. 4.3 Principle of the finite-difference method illustrated with a
simulation of acoustic waves in 1D through a medium with random velocity
perturbations. Bottom: Snapshot in space of the pressure field p(x, t). Top:
Close-up of a detail of the wave field with the grid points indicated by dots.
The finite-difference method is based on the calculation of partial
differentials with formulas as indicated in the figure using values in the
neighbourhood $p_{i\pm1}^{j\pm1}$ appropriately weighted. The indices i, j denote
discrete space and time levels, respectively. Here a three-point operator is
used to approximate the second derivatives in space (and time).

¹ Implicit schemes are algorithms where the state at t + dt depends on values at
t + dt, leading to a system of equations that needs to be solved at each time
step. Such schemes are sometimes more stable but are not necessarily more
accurate. For the wave equation implicit schemes play a minor role and are
therefore not discussed here.

The scheme is illustrated in Fig. 4.3, showing a snapshot of a 1D wavefield. The
values at three adjacent grid points are used to calculate the second derivatives
necessary to extrapolate to the next time step.
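
The explicit extrapolation can be sketched in a few lines of Python. This is a
minimal illustration under stated assumptions, not the implementation used for
the figures: the velocity is constant, the pressure is kept at zero at both ends
of the grid, the source time function is the derivative of a Gaussian injected
at one grid point, and all names and parameter values are hypothetical.

```python
import math


def simulate_acoustic_1d(nx=200, nt=300, dx=1.0, dt=0.5, c=1.0, isrc=100, f0=0.05):
    """Return the pressure field after nt explicit time steps (illustrative values)."""
    p_old = [0.0] * nx  # p(x, t - dt), the past
    p_now = [0.0] * nx  # p(x, t), now
    t0 = 4.0 / f0       # delay so that the source starts close to zero
    for it in range(nt):
        t = it * dt
        p_new = [0.0] * nx  # p(x, t + dt), the future; end points stay at zero
        for i in range(1, nx - 1):
            # three-point operator for the second space derivative
            d2p = (p_now[i + 1] - 2.0 * p_now[i] + p_now[i - 1]) / dx**2
            # Eq. 4.2 inserted into Eq. 4.1 and solved for p(x, t + dt)
            p_new[i] = 2.0 * p_now[i] - p_old[i] + (c * dt) ** 2 * d2p
        # source term s(x, t): derivative of a Gaussian at a single grid point
        src = -2.0 * (t - t0) * f0**2 * math.exp(-((f0 * (t - t0)) ** 2))
        p_new[isrc] += dt**2 * src
        p_old, p_now = p_now, p_new
    return p_now


if __name__ == "__main__":
    field = simulate_acoustic_1d()
    print("maximum absolute pressure:", max(abs(v) for v in field))
```

The default values give c dt / dx = 0.5; why this ratio matters for the stability
of the scheme is one of the topics analysed later in the chapter.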

The finite-difference method is very popular and widely used, primarily 
because of its simplicity and the easy adaptation to parallel hardware. A disadvan- 
tage of the finite-difference method is the relative difficulty with implementation 
of sufficiently accurate and stable boundary conditions, especially the non-planar 
free surface. Therefore the finite-difference method plays an important role in 
exploration geophysics where surface waves are often considered noise. Recent 
developments (see Moczo et al., 2014) indicate that accurate implementations
of the free surface with modified operators or hybrid schemes are possible. 


4.3 Finite differences and Taylor series 


Our first problem is to find a way to calculate derivatives of functions that
are defined on regularly spaced grid points. The finite-difference method is
based on the definition of the derivative of a function f(x) as the limit in
which the distance dx between the two function values defining the slope
approaches zero. As is obvious from the definitions below, there are three
different ways of introducing the derivative: the forward derivative,

$$\partial_x f(x) = \lim_{dx \to 0} \frac{f(x+dx) - f(x)}{dx}, \tag{4.3}$$

the centred derivative

$$\partial_x f(x) = \lim_{dx \to 0} \frac{f(x+dx) - f(x-dx)}{2\,dx}, \tag{4.4}$$

and the backward derivative

$$\partial_x f(x) = \lim_{dx \to 0} \frac{f(x) - f(x-dx)}{dx}. \tag{4.5}$$

The equal sign and the same naming are justified here, as in the limit the
derivatives are equal (provided that the function f(x) is continuous and smooth
around x). The finite in the finite-difference method originates in not taking
the limit but keeping a finite dx. As a consequence we obtain three definitions:
the forward difference, denoted

$$\partial_x^{+} f \approx \frac{f(x+dx) - f(x)}{dx}, \tag{4.6}$$

the central difference

$$\partial_x^{c} f \approx \frac{f(x+dx) - f(x-dx)}{2\,dx}, \tag{4.7}$$

and the backward difference

$$\partial_x^{-} f \approx \frac{f(x) - f(x-dx)}{dx}. \tag{4.8}$$

The approximate sign is important here as the derivatives at point x are not exact.
One of the most fundamental problems in numerical analysis is always to quantify
how accurate numerical derivative operations are,² and we will devote a substantial
part of this chapter to this question. For the moment we will restrict ourselves
to understanding the accuracy of finite-difference-based derivatives in the space
domain. Let us have a look at the definition of Taylor series:³


$$f(x+dx) = f(x) + f'(x)\,dx + \frac{1}{2!} f''(x)\,dx^2 + O(dx^3), \tag{4.9}$$

where O(dx³) denotes the remaining error term with the leading order 3.
Subtracting f(x) and dividing by dx leads us right away to the definition of the
forward difference given above

$$\frac{f(x+dx) - f(x)}{dx} = f'(x) + O(dx), \tag{4.10}$$

but also provides us with the leading order of the error term, which in this case
is 1, also written as O(dx). Surprisingly, using the same approach, subtracting
the Taylor series for f(x - dx) from that for f(x + dx) and dividing by 2dx
leads to

$$\frac{f(x+dx) - f(x-dx)}{2\,dx} = f'(x) + O(dx^2) \tag{4.11}$$

with an error term O(dx²)! This implies that a centred finite-difference scheme
converges more rapidly to the correct derivative on a regular grid, if one reduces
the grid spacing dx. At this point some important conclusions can be drawn. First,
it does matter which of the above formulations one uses to approximate a
derivative. Second, this does not imply that one or the other finite-difference
approximation is always the better one. This will become clear when playing with
these definitions and various simple partial differential equations. The message
is that one always has to verify the overall accuracy of a solution scheme. It
might even be that one of the approximations never leads to an accurate solution!⁴
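
The different error orders in Eqs. 4.10 and 4.11 can be checked numerically. The
sketch below is only an illustration: the test function sin(x), the evaluation
point, and all function names are arbitrary choices, and the order is estimated
by halving dx and comparing the two errors.

```python
import math


def forward_diff(f, x, dx):
    return (f(x + dx) - f(x)) / dx


def backward_diff(f, x, dx):
    return (f(x) - f(x - dx)) / dx


def centred_diff(f, x, dx):
    return (f(x + dx) - f(x - dx)) / (2.0 * dx)


def observed_order(diff, f, dfdx, x, dx):
    """Estimate the exponent of the leading error term by halving dx."""
    err_coarse = abs(diff(f, x, dx) - dfdx(x))
    err_fine = abs(diff(f, x, dx / 2.0) - dfdx(x))
    return math.log(err_coarse / err_fine, 2)


if __name__ == "__main__":
    for name, diff in (("forward", forward_diff),
                       ("backward", backward_diff),
                       ("centred", centred_diff)):
        order = observed_order(diff, math.sin, math.cos, 1.0, 0.01)
        print(name, round(order, 2))
```

Halving dx roughly halves the error of the forward and backward differences but
reduces the error of the centred difference by about a factor of four.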


² In his most famous speech Abraham Lincoln said: “Whatever differs from this
to the extent of the difference (...)”. Much of numerical analysis is about the
extent of the difference. How much inaccuracy can we afford in the solution of
our problem?

³ Space derivatives: $\partial_x f(x) \rightarrow f'(x)$,
$\partial_x^2 f(x) \rightarrow f''(x)$.

⁴ Examples are problems with advection terms, e.g. flow problems. Here, for the
advection term so-called upwind derivatives are used, which are realized by
non-centred derivatives.


4.3.1 Higher derivatives


What about higher derivatives? The partial differential equations we are
dealing with often (not always) have second (seldom higher) derivatives. Let us
start by using the definitions for the first derivatives developed above. Taking
the derivative of those terms mixing a forward and a backward definition leads to

$$\partial_x^2 f \approx \frac{\frac{f(x+dx) - f(x)}{dx} - \frac{f(x) - f(x-dx)}{dx}}{dx} = \frac{f(x+dx) - 2f(x) + f(x-dx)}{dx^2} \tag{4.12}$$

and the order of the leading error term can be determined in a straightforward
manner (see exercises). Let us introduce a more general way of determining the
weights with which the function values have to be multiplied to obtain
derivative approximations. This scheme will prove useful for finding more
accurate differential operators.

We write down the Taylor series for 2 grid points at x + dx, include the function 
at the central point as well, and multiply each by a real number. 


af(x+dx) = a ie +f (det Lf" (de? ++ | 
bf(x) = 6 [f@)] (4.13) 
of (x— dx) aie —f' (dx + if" @ae--.-]. 


Our goal is to find solution(s) for coefficients a, b,c such that the sum over the 
weighted functional values leads to approximations of the second derivative. First, 
we sum up the above equations, drop higher-order terms, and rearrange, to obtain 
af (x + dx) + bf (x) + of(x-dx) * 
f(x) [a+ b+ ¢] 
+dxf [a -d 


+hdxf" [a + cj. 


(4.14) 


Looking at these equations it is easy to see that to obtain an approximation of the 
second derivative we require 


atb+c=0 

a ere (4.15) 
2! 

‘. 7 dx2 


This is a linear system of equations that we can solve with standard linear algebra 
tools. Casting this in matrix form we get 


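$$
\begin{pmatrix} 1 & 1 & 1 \\ 1 & 0 & -1 \\ 1 & 0 & 1 \end{pmatrix}
\begin{pmatrix} a \\ b \\ c \end{pmatrix}
=
\begin{pmatrix} 0 \\ 0 \\ \frac{2!}{dx^2} \end{pmatrix},
\qquad
\mathbf{A}\,\mathbf{w} = \mathbf{s}, \tag{4.16}
$$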


defining A as the system matrix, w as the vector with the differential operator 
(weights), and s as the vector specifying the desired solution. It turns out the 
square matrix so defined is invertible and we find the operator weights by 


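$$
\mathbf{w} = \mathbf{A}^{-1}\,\mathbf{s} \tag{4.17}
$$
and obtain
$$
\begin{aligned}
a &= \frac{1}{dx^2} \\
b &= -\frac{2}{dx^2} \\
c &= \frac{1}{dx^2},
\end{aligned} \tag{4.18}
$$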


which is the approximation to the second derivative in its lowest-order form, as
presented in Eq. 4.12.⁵ It is instructive to investigate the behaviour of these
derivative approximations for various functions and explore the accuracy as the
number of grid points per wavelength decreases (see exercises).
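
The linear system of Eq. 4.16 can be solved in a few lines. The following is a minimal sketch rather than the book's supplementary code: it assumes dx = 1, so that the weights come out in units of 1/dx², works with exact fractions, and uses a small Gauss-Jordan helper `solve` that is introduced here purely for illustration.

```python
from fractions import Fraction


def solve(A, s):
    """Solve A w = s by Gauss-Jordan elimination with exact fractions."""
    n = len(A)
    # Augmented matrix [A | s]
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, s)]
    for col in range(n):
        # Find a row with a non-zero pivot and move it into place
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        # Normalize the pivot row
        M[col] = [v / M[col][col] for v in M[col]]
        # Eliminate this column from all other rows
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [row[-1] for row in M]


# System matrix and right-hand side of Eq. 4.16 with dx = 1 (assumption)
A = [[1, 1, 1],
     [1, 0, -1],
     [1, 0, 1]]
s = [0, 0, 2]          # 2!/dx^2 with dx = 1

a, b, c = solve(A, s)  # weights for f(x+dx), f(x), f(x-dx)
print(a, b, c)
```

Multiplying the result by 1/dx² gives the weights of Eq. 4.18 for any grid increment.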

The way we derived the differential operator by formulating a linear system of 
equations is actually very powerful. We can use it to derive more accurate opera- 
tors by looking at grid points further away from the point at which the derivative 
is calculated. 


4.3.2 High-order operators 


What happens if we extend the domain of influence for the derivative(s) of our 
function f(x)? Does it allow us to improve the accuracy of the derivative approx- 
imations? From the solution scheme above we expect to have to define a square 
linear system with as many unknown coefficients as Taylor expansions. As an 
example, let us search for a five-point operator for the second derivative. We seek 


$$
f''(x) \approx a f(x + 2dx) + b f(x + dx) + c f(x) + d f(x - dx) + e f(x - 2dx), \tag{4.19}
$$


and sum up the corresponding Taylor series up to order 4 obtaining a linear
system of equations for the coefficients 


$$
\begin{aligned}
a + b + c + d + e &= 0 \\
2a + b - d - 2e &= 0 \\
4a + b + d + 4e &= \frac{2}{dx^2} \\
8a + b - d - 8e &= 0 \\
16a + b + d + 16e &= 0,
\end{aligned} \tag{4.20}
$$
⁵ Note that the weights for the second derivative are symmetric with respect to
the central point, whereas the weights for the first derivative are antisymmetric.
In other words, the information at the central point is not contributing to the
first derivative!
Fig. 4.4 Graphical illustration of high-order Taylor operators for the first
derivative. The derivative is defined in between the central grid points (white
square). The differential weights (black squares) rapidly decrease with distance
from the central point of evaluation.


after multiplying each line in order to obtain integer coefficients. Using matrix 
inversion we obtain a unique solution 


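$$
\begin{aligned}
a &= -\frac{1}{12\,dx^2} \\
b &= \frac{4}{3\,dx^2} \\
c &= -\frac{5}{2\,dx^2} \\
d &= \frac{4}{3\,dx^2} \\
e &= -\frac{1}{12\,dx^2}.
\end{aligned} \tag{4.21}
$$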


The leading error term for the second derivative using these difference weights
is $O(dx^4)$ (see exercises). This accuracy improvement is indeed substantial and
from a practical point of view such a five-point operator (or equivalently four- 
point operator for the first derivative) should always be preferred. 
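
The gain in accuracy can be checked numerically. The sketch below applies the three-point weights of Eq. 4.12 and the five-point weights of Eq. 4.21 to sin(x) and halves the grid increment. The test function, the evaluation point, and the two values of dx are arbitrary choices made for illustration.

```python
import math


def d2_three_point(f, x, dx):
    """Three-point second derivative, Eq. 4.12."""
    return (f(x + dx) - 2 * f(x) + f(x - dx)) / dx ** 2


def d2_five_point(f, x, dx):
    """Five-point second derivative with the weights of Eq. 4.21."""
    return (-f(x + 2 * dx) + 16 * f(x + dx) - 30 * f(x)
            + 16 * f(x - dx) - f(x - 2 * dx)) / (12 * dx ** 2)


x0 = 1.0                      # evaluation point (arbitrary choice)
exact = -math.sin(x0)         # analytical second derivative of sin(x)

ratios = {}
for name, op in [("3-point", d2_three_point), ("5-point", d2_five_point)]:
    err_coarse = abs(op(math.sin, x0, 0.1) - exact)
    err_fine = abs(op(math.sin, x0, 0.05) - exact)
    # Halving dx should reduce the error by a factor close to 4 for a
    # second-order operator and close to 16 for a fourth-order operator
    ratios[name] = err_coarse / err_fine
    print(name, err_coarse, err_fine, ratios[name])
```

The error ratio between the two grid increments is a simple way to estimate the order of the leading error term without working through the Taylor series by hand.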

In principle—using the linear system above—we could also seek the corre- 
sponding weights for a first or third derivative (or an interpolation) by simply 
changing the solution vector on the right-hand side. We note—and for two-sided 
operators this is general—that the modulus of the coefficients decreases with 
increasing distance from the central point (see illustration in Fig. 4.4). 
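
Changing the right-hand side is easy to try out. The sketch below builds the Taylor system for an arbitrary set of grid offsets and solves it for a chosen derivative order. The function name `fd_weights` and its interface are hypothetical, the rows are not rescaled to integer coefficients as in Eq. 4.20, and the weights are returned in units of 1/dx^order.

```python
from fractions import Fraction
from math import factorial


def fd_weights(offsets, order):
    """Weights w_i such that sum_i w_i f(x + offsets[i]*dx) approximates
    the derivative of the given order, in units of 1/dx**order."""
    n = len(offsets)
    # Row m collects the Taylor coefficient of f^(m): sum_i w_i offsets[i]^m / m!
    M = [[Fraction(o) ** m / factorial(m) for o in offsets] for m in range(n)]
    rhs = [Fraction(1 if m == order else 0) for m in range(n)]
    # Gauss-Jordan elimination on the augmented matrix
    aug = [row + [r] for row, r in zip(M, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


offsets = [2, 1, 0, -1, -2]            # grid points x+2dx ... x-2dx
w_second = fd_weights(offsets, 2)      # right-hand side selects f''
w_first = fd_weights(offsets, 1)       # same matrix, right-hand side selects f'
print([str(w) for w in w_second])
print([str(w) for w in w_first])
```

The second-derivative weights come out symmetric and the first-derivative weights antisymmetric with a zero central weight. One-sided operators for boundaries follow from the same function by passing offsets that lie on one side only, for example `[0, 1, 2]`.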

As will be explained below, one of the most widely used concepts in connec- 
tion with finite-difference solutions to the elastic wave equation is so-called grid 
staggering. With this method—either in space or time—the derivative is calculated 
half-way between two points on a regular spaced grid. We have just documented 
that the number of points used for the calculation of numerical derivatives (we 
also say the length of the finite-difference operator) is directly related to the accu- 
racy of the derivative. Obviously, this comes at the price of having to undertake 
more floating-point operations. Finding the right balance between computational 
effort and mathematical accuracy is at the heart of numerical analysis. It also 
depends on the specific hardware a computer program is implemented on. 

As far as finite-difference operators are concerned, it turns out that in the case 
of first space derivatives four-point operators are the most widely used (and five- 
point operators for second derivatives). The accuracy improvement from second 
to fourth order is substantial. Using longer operators usually does not pay off. 
The mathematical approach presented in this section can also be used to de- 
velop one-sided high-order differential operators for boundaries (see exercises). 
It is instructive to compare high-order Taylor operators with those obtained using 
Fourier transforms (see chapter on the pseudospectral method). 


4.4 Acoustic wave propagation in 1D 


Let us now proceed to find numerical solutions to the wave equation. The 
finite-difference method allows a fairly simple analysis of the basic properties of 


numerical solutions for wave equations that are quite general. Therefore, we start 
with the simplest wave equation—the constant density acoustic wave equation— 
in 1D. In terms of physics, this equation might describe pressure waves in a gas or 
stationary fluid. A snapshot example of acoustic waves in a homogeneous medium 
is shown in Fig. 4.5. With slight modifications this scalar wave equation is also 
descriptive of the vibrations of a string. 

Using dense notation and omitting the spatial and temporal dependencies the 
scalar acoustic wave equation in Cartesian coordinates can be written as 


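$$
\partial_t^2 p = c^2\, \partial_x^2 p + s, \tag{4.22}
$$
imposing pressure-free conditions at the two boundaries as


$$
p(x)\big|_{x=0,L} = 0. \tag{4.23}
$$


The following dependencies apply:


$$
\begin{aligned}
p &\rightarrow p(x,t) && \text{pressure} \\
c &\rightarrow c(x) && \text{P-velocity} \\
s &\rightarrow s(x,t) && \text{source term.}
\end{aligned}
$$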


As a first step we discretize space and time with constant increments dx and dt. 
Thus 


$$
\begin{aligned}
x_j &= j\,dx, & j &= 0, \dots, j_{max} \\
t_n &= n\,dt, & n &= 0, \dots, n_{max}.
\end{aligned} \tag{4.24}
$$


The choice of these space and time increments is crucial and we will discuss 
this in more detail when we are able to forecast the consequences for our wave- 
propagation problem. We now make the step from the continuous description 
of the partial differential equation to a discrete description using the indices 
introduced above. From now on, the upper index will correspond to the time 
discretization, and the lower index (or indices) will correspond to the spatial 
discretization, for example 


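$$
\begin{aligned}
p_j^{n+1} &\rightarrow p(x_j, t_n + dt) \\
p_j^{n} &\rightarrow p(x_j, t_n) \\
p_j^{n-1} &\rightarrow p(x_j, t_n - dt) \\
p_{j+1}^{n} &\rightarrow p(x_j + dx, t_n) \\
p_j^{n} &\rightarrow p(x_j, t_n) \\
p_{j-1}^{n} &\rightarrow p(x_j - dx, t_n).
\end{aligned} \tag{4.25}
$$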

This discretization implies that, when describing the space-dependent fields on 
the computer, we will initialize arrays with the corresponding dimensions (in 1D
Fig. 4.5 Acoustic wave simulation. Pressure waves radiate isotropically
from the central point. A sinusoidal source time function was applied.
Fig. 4.6 Illustration of the space-time discretization scheme of the
finite-difference algorithm for the 1D acoustic wave equation. The x-axis
corresponds to space, the y-axis to time. The open circle denotes the point
$p(x_j, t_{n+1})$ to which the state of the pressure field is extrapolated.
Such a space-time operator is also called a stencil.
Fig. 4.7 Dirac delta function. Discretized Gauss functions converging towards
the Dirac delta function as the halfwidth is decreasing. The integral remains
constant and is unity (see Eq. 4.32).


column vectors) and are able to address the values at certain locations and their 
neighbours by means of their indices. 

With the definitions of the derivatives introduced in the previous section we 
can replace the partial derivatives in Eq. 4.22 to obtain 


$$
\frac{p_j^{n+1} - 2p_j^n + p_j^{n-1}}{dt^2}
= c_j^2 \left[ \frac{p_{j+1}^n - 2p_j^n + p_{j-1}^n}{dx^2} \right] + s_j^n. \tag{4.26}
$$

Note that the right-hand side is defined at the same time level $n$, at which
information around the grid point $j$ is used to calculate the spatial derivatives
(see Fig. 4.6). The calculation of the time derivatives on the left-hand side
requires information from three different time levels. Assuming that the
information at the levels $n$ (the present) and $n-1$ (the past) is known, we can
solve for the unknown future field $p_j^{n+1}$:

$$
p_j^{n+1} = c_j^2 \frac{dt^2}{dx^2} \left[ p_{j+1}^n - 2p_j^n + p_{j-1}^n \right]
+ 2p_j^n - p_j^{n-1} + dt^2 s_j^n. \tag{4.27}
$$


In practice, we will loop over time and, at each time level n, calculate the 
space derivatives using the information from neighbouring grid points (i.e. loop 
over space). The initial condition of our wave simulation problem is such that 
everything is at rest at time $t = 0$:


$$
p(x, t=0) = 0, \qquad \partial_t p(x, t=0) = 0. \tag{4.28}
$$


Waves radiate as soon as the source term $s(x, t)$ starts to act. Let us have a closer
look at the spatial and temporal form of the source term. In many situations (seis- 
mic exploration, earthquake seismology) sources of seismic waves are treated as 
points (later we will consider the case of finite-sized sources). In general, a geo- 
physical problem requires the source point to be at a certain location that might 
not coincide with a grid point. For the sake of simplicity let us assume for the 
moment that the source acts directly at a grid point with index $j_s$.

What about the temporal behaviour of the source? In fact, from a physical 
point of view it is often useful to calculate the Green’s function, that is, the 
response to an impulse in both time and space of the form 


$$
s(x, t) = \delta(x - x_s)\,\delta(t - t_s), \tag{4.29}
$$


where $x_s$ and $t_s$ are source location and source time, respectively, and
$\delta(\cdot)$ corresponds to the Dirac delta function (see Chapter 2 for analytical solutions). While
the injection of a spatial point source can in principle be realized (see section 
below), a delta function in time cannot. A delta function contains all frequen- 
cies (white spectrum) and we cannot expect that our numerical algorithm will be 
capable of providing accurate solutions in this case (see discussion of this prob- 
lem in Chapter 2). Therefore, we will operate with a band-limited source time 
function 


$$
s(x, t) = \delta(x - x_s)\, f(t), \tag{4.30}
$$


where the temporal behaviour f(t) is chosen according to our specific physical 
problem. 

The correct implementation of sources in a numerical solver such that the 
results converge to the analytical solution requires some care. Referring to the 
discussion of the analytical solutions in Chapter 2 a delta-like point source can 
be implemented, provided that its spatial integration leads to unity. This can be 
achieved with appropriately scaled functions that converge to the delta function. 
One possibility is the boxcar function 


$$
\delta_{bc}(x) =
\begin{cases}
1/dx & |x| \le dx/2 \\
0 & \text{elsewhere}
\end{cases} \tag{4.31}
$$


scaled by the width of the box (here: the grid point distance dx, in general the 
grid cell volume). For an illustration see Fig. 2.8. This function solves the scaling 
issue for the spatial source function. What about time? 
Another function that, in the limit $a \rightarrow 0$, converges to a delta function is the
Gaussian

$$
\delta_a(t) = \frac{1}{\sqrt{2\pi a}}\, e^{-\frac{t^2}{2a}} \tag{4.32}
$$

with the required integration property (see exercises in Chapter 2). An illustration 
is given in Fig. 4.7. It is important to test numerical schemes against the analytical 
solutions with respect to proper source scaling (see supplementary material). 
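
The unit-integral property is easy to confirm on a discrete time axis. The following sketch samples the Gaussian of Eq. 4.32 for decreasing values of a and integrates it with the rectangle rule. The sampling interval, the integration window, and the values of a are illustrative assumptions.

```python
import math


def gaussian_delta(t, a):
    """Gaussian of Eq. 4.32 that tends to a delta function as a -> 0."""
    return math.exp(-t ** 2 / (2 * a)) / math.sqrt(2 * math.pi * a)


dt = 0.001                                        # sampling interval (assumption)
times = [(i - 5000) * dt for i in range(10001)]   # samples on [-5, 5]

integrals = []
peaks = []
for a in (0.1, 0.01, 0.001):
    # Rectangle-rule integral of the sampled Gaussian
    integrals.append(sum(gaussian_delta(t, a) for t in times) * dt)
    peaks.append(gaussian_delta(0.0, a))
    print(a, integrals[-1], peaks[-1])

# Boxcar of Eq. 4.31: height 1/dx over one grid cell of width dx
dx = 0.5
boxcar_integral = (1.0 / dx) * dx
print(boxcar_integral)
```

The peak grows as a decreases while the integral stays at one, which is the behaviour required of a function that approximates the delta function.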
Let us discuss an example. Suppose we want to simulate acoustic wave prop- 
agation in a 10km column (e.g. the atmosphere) and assume an air sound speed 
of c = 343 m/s. We would like to hear the sound wave so it would need a domi- 
nant frequency of at least 20 Hz (at the bottom of the audible frequency range). 
For the purpose of this exercise we initialize the source time function $f(t)$ using
the first derivative of a Gaussian function (because we are aware that in 1D the 
resulting signal is an integral of the source time function and we want a Gaussian 
waveform) 


$$
f(t) = -8 f_0 (t - t_0)\, e^{-(4 f_0)^2 (t - t_0)^2}, \tag{4.33}
$$


where $t_0$ corresponds to the time of the zero-crossing, and $f_0$ is the dominant
frequency. The source time function and its amplitude spectrum are illustrated in 
Fig. 4.8. 
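
A source time function of this kind takes only a few lines to set up. The sketch below assumes that the exponent in Eq. 4.33 reads −(4f₀)²(t − t₀)², which is the dimensionally consistent form. The time shift `t0` and the number of samples `nt` are illustrative choices.

```python
import math

f0 = 20.0            # dominant frequency (Hz), as in the text
t0 = 4.0 / f0        # time of the zero-crossing (s); this shift is an assumption
dt = 0.0012          # time increment (s)
nt = 400             # number of samples (assumption)


def source_time_function(t):
    """First derivative of a Gaussian, Eq. 4.33 (assumed exponent form)."""
    return -8.0 * f0 * (t - t0) * math.exp(-(4.0 * f0) ** 2 * (t - t0) ** 2)


src = [source_time_function(i * dt) for i in range(nt)]

# The extrema lie symmetrically around the zero-crossing at t0
i_max = max(range(nt), key=lambda i: src[i])
i_min = min(range(nt), key=lambda i: src[i])
print("maximum at t =", i_max * dt, "s, minimum at t =", i_min * dt, "s")
```

The shift `t0` should be large enough that the wavelet is effectively zero at t = 0. Otherwise the simulation starts with a jump in the source, which injects high frequencies.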

At this point we have to choose the spatio-temporal setup of the simulation and 
this is fundamental to the solution of all wave-propagation problems, independent 
of the specific numerical methods. The following questions have to be answered 
carefully before starting any simulation: 


• What is the minimum spatial wavelength that propagates inside the
medium?
Fig. 4.8 Source time functions: Top: Frequently used source time functions in
wave propagation are Gaussian functions (and their derivatives). Here, a
source time function (first derivative of a Gaussian) with a dominant frequency
of 20 Hz is shown. Bottom: Amplitude spectrum of the source time function
given above.
Fig. 4.9 Simulation results (snapshots in space) for the 1D acoustic wave
equation at various propagation distances (given as number of dominant
wavelengths propagated). In theory, the Gaussian shaped signal should
propagate undistorted for ever. However, the finite-difference discretization
results in the signal becoming dispersive and it disintegrates with time (or
distance). The propagation direction is indicated by an arrow.


• What is the maximum velocity inside the medium?


• What is the propagation distance of the wavefield (e.g. in dominant
wavelengths)?


In order to answer these questions it is sufficient to look at the relation between 
frequency and wavenumber 


$$
c = \frac{\lambda}{T} = \lambda f = \frac{\omega}{k}, \tag{4.34}
$$


where $c$ is velocity, $T$ is period, $\lambda$ is wavelength, $f$ is frequency, $k$ is
wavenumber, and $\omega = 2\pi f$ is angular frequency. We have chosen a source time
function with a dominant frequency of $f_0 = 20$ Hz. From Fig. 4.8, however, we can see that a substantial
amount of energy in the wavelet is at frequencies above 20 Hz. 

For the given velocity the corresponding wavelengths are $\lambda \approx 17$ m and $\lambda \approx 7$ m
for frequencies 20 Hz and 50 Hz, respectively. For this exercise we choose a grid
increment of dx = 0.5 m which would result in about 34 points per spatial
wavelength for the dominant frequency. The time increment will be set to dt = 0.0012 s
corresponding to ~40 points per dominant period. Restrictions of the choice of
the space-time discretization will be discussed in the next section. 
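
The numbers quoted above follow directly from Eq. 4.34. A short sketch of the arithmetic is given below. The variable names are illustrative, and the dimensionless ratio `courant` anticipates the stability discussion of the next section.

```python
c = 343.0        # sound speed (m/s)
f0 = 20.0        # dominant frequency (Hz)
f_high = 50.0    # higher frequency still present in the wavelet (Hz)
dx = 0.5         # grid increment (m)
dt = 0.0012      # time increment (s)

wavelength_dom = c / f0                    # lambda = c / f, Eq. 4.34
wavelength_high = c / f_high
points_per_wavelength = wavelength_dom / dx
points_per_wavelength_high = wavelength_high / dx
points_per_period = (1.0 / f0) / dt
courant = c * dt / dx                      # ratio of physical to grid velocity
nx = int(10000.0 / dx)                     # grid points for the 10 km column

print(f"dominant wavelength        : {wavelength_dom:.2f} m")
print(f"wavelength at 50 Hz        : {wavelength_high:.2f} m")
print(f"points per dom. wavelength : {points_per_wavelength:.1f}")
print(f"points per 50 Hz wavelength: {points_per_wavelength_high:.1f}")
print(f"points per dominant period : {points_per_period:.1f}")
print(f"c * dt / dx                : {courant:.3f}")
print(f"number of grid points      : {nx}")
```

Note that the energy at 50 Hz is sampled with far fewer points per wavelength than the dominant frequency, which is why the high-frequency part of the signal suffers first.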

In the following a Python code fragment is presented with the core of the 
finite-difference algorithm: 


    # Time extrapolation
    for it in range(nt):

        # Calculate partial derivatives (omit boundaries)
        for i in range(1, nx - 1):
            d2p[i] = (p[i + 1] - 2 * p[i]
                      + p[i - 1]) / dx ** 2

        # Time extrapolation
        pnew = 2 * p - pold + dt ** 2 * c ** 2 * d2p

        # Add source term at isrc
        pnew[isrc] = pnew[isrc] + dt ** 2 * src[it] / dx

        # Remap time levels
        pold, p = p, pnew

Here, nt is the maximum number of time steps, nx = 20,000 is the number of
grid points for x, isrc is the grid point at which the source is injected, and src[it] is
the source time function scaled by the grid increment. For the sake of simplicity 
the acoustic velocity c is kept constant. The results of this simulation example at 
various time steps (i.e. propagation distances in terms of dominant wavelengths) 
are shown in Fig. 4.9. 
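
For readers who want to run the algorithm straight away, a self-contained variant might look as follows. It is a sketch under several assumptions. It uses plain Python lists instead of array arithmetic so that it runs without additional packages, the grid is reduced to 1,000 points, the source sits in the centre, and the source time shift `t0` is an arbitrary choice.

```python
import math

nx = 1000                # number of grid points (reduced, assumption)
dx = 0.5                 # grid increment (m)
c = 343.0                # acoustic velocity (m/s)
dt = 0.0012              # time increment (s)
nt = 500                 # number of time steps
isrc = nx // 2           # source grid point
f0 = 20.0                # dominant frequency (Hz)
t0 = 4.0 / f0            # source time shift (s), an assumption

# Source time function: first derivative of a Gaussian
src = [-8.0 * f0 * (it * dt - t0) * math.exp(-(4.0 * f0) ** 2 * (it * dt - t0) ** 2)
       for it in range(nt)]

pold = [0.0] * nx        # pressure at time level n-1
p = [0.0] * nx           # pressure at time level n

for it in range(nt):
    pnew = [0.0] * nx    # boundary values stay at zero (Eq. 4.23)
    for j in range(1, nx - 1):
        d2p = (p[j + 1] - 2.0 * p[j] + p[j - 1]) / dx ** 2
        pnew[j] = 2.0 * p[j] - pold[j] + dt ** 2 * c ** 2 * d2p
    # Inject the source, scaled by the grid increment
    pnew[isrc] += dt ** 2 * src[it] / dx
    # Remap time levels
    pold, p = p, pnew

peak = max(range(nx), key=lambda j: abs(p[j]))
print("largest amplitude at x =", peak * dx, "m; source at x =", isrc * dx, "m")
```

The pulse should be found at a distance of roughly c(t − t₀) from the source, which is a quick sanity check on the propagation velocity of the scheme.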

In theory, the integral of the injected source time function (thus a Gaussian) 
should propagate for ever without distortion. This is not what we observe in our 
numerical simulation. After propagating about 100 wavelengths, the waveform 


begins to disintegrate and forms a distinctive tail. This phenomenon is called 
numerical dispersion and is an intrinsic feature of almost all numerical approx- 
imations to wave equations. The velocity of the wavefield becomes frequency 
dependent as a function of the discretization. Of course this phenomenon needs 
to be avoided. The choice of the right set-up is the central task for any seismic 
simulation experiment. In the next section we shed light on these phenomena 
using analytical tools. 

How can we analyse our numerical approximation to the wave equation and 
compare with the analytical solution in a quantitative way? This question leads us 
to the so-called von Neumann analysis.⁶


4.4.1 Stability 


We start with the definition of a plane complex harmonic wave for pressure p 
propagating in x-direction with wavenumber $k$ and angular frequency $\omega$


$$
p(x, t) = e^{i(kx - \omega t)}. \tag{4.35}
$$


In our finite-difference approximation to the wave equation we introduced the 
spatio-temporal discretization as 


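$$
\begin{aligned}
x_j &\rightarrow j\,dx \\
t_n &\rightarrow n\,dt
\end{aligned} \tag{4.36}
$$
which we will use in the plane-wave formula such that
$$
\begin{aligned}
p_j^n &= e^{i(kj\,dx - \omega n\,dt)} \\
p_{j+1}^n &= e^{i(k(j+1)dx - \omega n\,dt)}
= e^{ik\,dx}\, e^{i(kj\,dx - \omega n\,dt)}
= e^{ik\,dx}\, p_j^n \\
p_j^{n+1} &= e^{-i\omega\,dt}\, p_j^n.
\end{aligned} \tag{4.37}
$$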


With this discrete plane-wave trial solution we can enter the (source-free) finite- 
difference approximation of the acoustic wave equation (Eq. 4.22), replacing all 
terms in Eq. 4.27 with the terms above to obtain 


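$$
e^{i\omega dt} + e^{-i\omega dt} - 2
= c^2 \frac{dt^2}{dx^2} \left( e^{ik dx} + e^{-ik dx} - 2 \right), \tag{4.38}
$$


where we divided by the term $p_j^n$ on both sides. Using the definition


$$
\cos x = \frac{1}{2}\left(e^{ix} + e^{-ix}\right) \tag{4.39}
$$


we get


$$
\cos(\omega dt) - 1 = c^2 \frac{dt^2}{dx^2} \left( \cos(k dx) - 1 \right), \tag{4.40}
$$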


and with the trigonometric relation 
⁶ John von Neumann (1903-1957) was
a mathematician of Hungarian origin who 
is considered (amongst other things) as 
one of the fathers of quantum mechanics, 
game theory, and computational science. 
Fig. 4.10 Stability. Acoustic simulation
in 1D with a CFL criterion ε = 1.0001.
Initially the simulation runs fine but 
then starts to explode in the centre of 
the domain with exponentially growing 
numbers. Try this out with computer 
practicals. It is fun! 


⁷ The stability criterion is the most important concept to remember. Its
principle is essential for the planning of simulation tasks. Note that it is
dimensionless and is essentially the ratio of a physical velocity c and a grid
velocity dx/dt.


$$
\sin\frac{x}{2} = \pm\sqrt{\frac{1 - \cos x}{2}} \tag{4.41}
$$
we finally arrive at
$$
\sin\left(\omega \frac{dt}{2}\right) = c\,\frac{dt}{dx}\, \sin\left(k \frac{dx}{2}\right). \tag{4.42}
$$


Note that here we assume true physical wavenumber resulting in a grid-dependent 
frequency (and thus numerical wavenumber). From this relation one of the most 
important conclusions in numerical analysis can be drawn. The equation only has 
real solutions in general when the connecting term has the following property 


$$
\epsilon = c\,\frac{dt}{dx} \le 1, \tag{4.43}
$$


as the sine terms on both sides have values in the interval [−1, 1]. If this condition
is not met the solution will explode and is called unstable. This stability
condition is also called the Courant-Friedrichs-Lewy (CFL) criterion (named after
three scientists who described it in a paper in 1928).⁷ This result has important
consequences:


• The space-time discretization cannot be arbitrarily chosen but depends on
the medium properties (here: velocity c).


• The CFL criterion describes the conditional stability that leads to convergent
behaviour of the solution.


• As the space discretization (increment dx) is often imposed by considering
the smallest seismic velocities in the medium and the highest frequencies
(i.e. the shortest wavelengths), the CFL criterion determines the time
increment dt and thus the number of time steps to achieve a certain simulation
length.


• The actual value (in this case 1) that has to be respected depends on the
number of space dimensions and the overall algorithm.


• It is important to note that the fulfilment of the CFL criterion is necessary
but not sufficient to guarantee an accurate simulation!
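
The stability limit can be observed with a very small experiment. The sketch below runs the source-free 1D scheme once below and once above the limit and compares the largest amplitudes. The function `max_amplitude`, the spike used as initial condition, the grid size, and the two values of ε are illustrative assumptions.

```python
def max_amplitude(eps, nx=201, nt=300):
    """Run a source-free 1D simulation with eps = c*dt/dx and return the
    largest absolute pressure after nt time steps."""
    pold = [0.0] * nx
    p = [0.0] * nx
    p[nx // 2] = 1.0          # initial spike at the centre (assumption)
    pold[nx // 2] = 1.0       # same value one step earlier: field initially at rest
    for _ in range(nt):
        pnew = [0.0] * nx     # boundary values stay at zero
        for j in range(1, nx - 1):
            # Eq. 4.27 without source; only eps**2 = c^2 dt^2 / dx^2 enters
            pnew[j] = (eps ** 2 * (p[j + 1] - 2.0 * p[j] + p[j - 1])
                       + 2.0 * p[j] - pold[j])
        pold, p = p, pnew
    return max(abs(v) for v in p)


stable = max_amplitude(0.9)
unstable = max_amplitude(1.1)
print("eps = 0.9:", stable)
print("eps = 1.1:", unstable)
```

A spike is used because it contains the shortest wavelengths the grid can represent, and these are the first to grow when the criterion is violated.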


Equation 4.42 allows us to investigate the relationship between wavelength and 
frequency in our numerical realm in more detail. 


4.4.2 Numerical dispersion 


Following equation 4.42 the angular frequency $\omega$ can be expressed as


$$
\omega = \frac{2}{dt} \sin^{-1}\left[ c\,\frac{dt}{dx}\, \sin\left(k\frac{dx}{2}\right) \right]. \tag{4.44}
$$

The phase velocity c that—in theory—should be identical to the acoustic velocity 
can be obtained by dividing the above equation by wavenumber k 


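$$
c(k) = \frac{\omega}{k}
= \frac{2}{k\,dt} \sin^{-1}\left[ c\,\frac{dt}{dx}\, \sin\left(k\frac{dx}{2}\right) \right]. \tag{4.45}
$$
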
The relationship explains the observations we documented above in our first nu- 
merical example. The phase velocity is no longer independent of wavenumber as 
was the case in the original acoustic wave equation. Our numerical approximation 
leads to (unphysical!) dispersion. 
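
The dispersion relation can be verified numerically by inserting a discrete plane wave into the source-free scheme of Eq. 4.27. In the sketch below the velocity, the grid increments, and the wavenumber are illustrative assumptions, and `plane_wave` is a helper introduced only for this check.

```python
import cmath
import math

c, dx, dt = 2000.0, 10.0, 0.004      # illustrative values (assumptions)
eps = c * dt / dx                    # here 0.8
k = 2.0 * math.pi / (8 * dx)         # wavenumber for 8 grid points per wavelength

# Numerical angular frequency, Eq. 4.44
omega = 2.0 / dt * math.asin(eps * math.sin(k * dx / 2.0))


def plane_wave(j, n):
    """Discrete plane wave p_j^n = exp(i(k j dx - omega n dt))."""
    return cmath.exp(1j * (k * j * dx - omega * n * dt))


# Residual of the source-free scheme (Eq. 4.27) at an arbitrary grid point
j, n = 5, 3
lhs = plane_wave(j, n + 1)
rhs = (eps ** 2 * (plane_wave(j + 1, n) - 2 * plane_wave(j, n) + plane_wave(j - 1, n))
       + 2 * plane_wave(j, n) - plane_wave(j, n - 1))
residual = abs(lhs - rhs)

c_num = omega / k                    # numerical phase velocity, Eq. 4.45
print("residual:", residual)
print("numerical phase velocity:", c_num, "m/s; true velocity:", c, "m/s")
```

The residual vanishes to rounding precision only if the frequency is taken from Eq. 4.44. With the exact relation ω = ck the plane wave would not satisfy the discrete equation, which is another way of saying that the grid changes the propagation velocity.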

This is illustrated in Fig. 4.11 where the phase velocity is shown as a function of 
the number of grid points per wavelength.⁸ This figure has an important message.
Unless a sufficient number of grid points per (dominant or minimum) wavelength 
is used, the wavefield is strongly dispersive and leads to inaccurate results. 

Note that a curious situation occurs if you set the CFL limit ε = 1 in the
acoustic 1D example. In that case you would not observe any dispersion! While
this sounds interesting, it is not really significant for practical applications
(remember we are using numerical methods to be able to simulate wave propagation
through heterogeneous material, so the CFL limit varies in space).
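
Curves of this kind can be tabulated with a few lines of code. The sketch below rewrites Eq. 4.45 in terms of the number of grid points per wavelength n, using k dx = 2π/n and dt = ε dx/c so that dx cancels. The function name and the selected values of ε and n are illustrative assumptions.

```python
import math


def numerical_phase_velocity(c, eps, n_per_wavelength):
    """Eq. 4.45 written in terms of grid points per wavelength.

    With k*dx = 2*pi/n and dt = eps*dx/c the grid increment dx cancels.
    """
    half_kdx = math.pi / n_per_wavelength
    return c * math.asin(eps * math.sin(half_kdx)) / (eps * half_kdx)


c = 2000.0                                   # true velocity (m/s)
for eps in (0.5, 0.8, 0.9, 1.0):
    row = [numerical_phase_velocity(c, eps, n) for n in (2, 4, 8, 16, 32)]
    print(f"eps = {eps}: " + "  ".join(f"{v:7.1f}" for v in row))
```

For ε = 1 every entry equals the true velocity, while for smaller ε the velocity drops markedly as the number of points per wavelength approaches the Nyquist limit of two.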

In many publications—and particularly when new numerical approaches 
appear—there are statements about the number of grid points to be used for 
accurate simulations. By themselves, such statements are not meaningful as the 
accuracy of a simulation also depends on the overall propagation distance. The 
longer the propagation distance, the more the errors accumulate, and the more 
grid points per wavelength have to be used. 


4.4.3 Convergence 


We introduced finite-difference operators with the idea that when the finite dif- 
ference becomes infinitesimally small the analytical derivative is recovered. Does 
that hold for our numerical approximation of the acoustic wave equation? We can 
answer this question by careful inspection of Eq. 4.45 using 


[Figure 4.11: plot with x-axis "Number of grid points per wavelength" and curves
labelled by CFL value, e.g. ε = 0.8 and ε = 0.9.]
⁸ This is more instructive than showing the wavenumber. The Nyquist
wavenumber $k_N = \pi/dx$ corresponds to 2 grid points per wavelength, dx being the
grid increment.


Fig. 4.11 Numerical dispersion. The numerical phase velocity is shown as
a function of number of grid points per wavelength for various CFL criteria.
The left limit corresponds to two points per wavelength (Nyquist wavenumber).
The correct propagation velocity is 2,000 m/s. As the number of grid points per
wavelength increases the correct velocity is recovered.
Fig. 4.12 Regular grid with horizontal increment dx and vertical increment
dz equal. This is not a requirement but differences in these increments result in
stronger directional dependencies of the errors. In a regular grid propagation at
an angle of 45° with respect to the coordinate axes is most accurate (see text).


$$
\begin{aligned}
\sin x &\approx x \quad \text{for small } x \\
\sin^{-1} x &\approx x \quad \text{for small } x.
\end{aligned} \tag{4.46}
$$


It is easy to show that as the spatial increment dx and the time increment dt 
converge to zero the original dispersion relation 


$$
\lim_{dx \to 0} c(k) = \frac{\omega}{k} = c_{exact} \tag{4.47}
$$


is recovered, demonstrating the convergence of the numerical algorithm (homoge- 
neous case) to the analytical solution. 
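
The limit can be spelled out in one line. Replacing both the sine and the inverse sine in Eq. 4.45 by their arguments, which is justified when k dx and c dt/dx · k dx/2 are small, gives

```latex
c(k) = \frac{2}{k\,dt}\,\sin^{-1}\!\left[c\,\frac{dt}{dx}\,\sin\!\left(\frac{k\,dx}{2}\right)\right]
\;\approx\; \frac{2}{k\,dt}\; c\,\frac{dt}{dx}\;\frac{k\,dx}{2} \;=\; c
\qquad (dx,\,dt \to 0).
```

All grid-dependent factors cancel, so the numerical phase velocity no longer depends on the wavenumber and the dispersion disappears in the limit.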


4.5 Acoustic wave propagation in 2D 


Even though the same principles should apply in higher dimensions we will con- 
sider here the acoustic wave equation in 2D as it allows us to investigate the 
behaviour of the errors as a function of propagation direction. Also, in practice the 
2D acoustic algorithm (see supplementary material) already allows many useful 
practical exercises with interesting model set-ups. 

In 2D the constant-density acoustic wave equation is given by 


$$
\partial_t^2 p(x, z, t) = c(x, z)^2 \left( \partial_x^2 p(x, z, t) + \partial_z^2 p(x, z, t) \right) + s(x, z, t), \tag{4.48}
$$


where the z-coordinate is chosen because in many applications the x-z plane is
considered a vertical plane with z as depth coordinate. In accordance with the 
above developments we discretize space-time with 


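$$
p(x, z, t) \rightarrow p_{j,k}^n = p(n\,dt, j\,dx, k\,dz). \tag{4.49}
$$
Using the three-point operator for the second derivatives in time leads us to the
extrapolation scheme


$$
\frac{p_{j,k}^{n+1} - 2p_{j,k}^n + p_{j,k}^{n-1}}{dt^2}
= c_{j,k}^2 \left( \partial_x^2 p + \partial_z^2 p \right) + s_{j,k}^n, \tag{4.50}
$$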


where on the r.h.s. the space and time dependencies are implicitly assumed and 
the partial derivatives are approximated by 


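$$
\begin{aligned}
\partial_x^2 p &= \frac{p_{j+1,k}^n - 2p_{j,k}^n + p_{j-1,k}^n}{dx^2} \\
\partial_z^2 p &= \frac{p_{j,k+1}^n - 2p_{j,k}^n + p_{j,k-1}^n}{dz^2}.
\end{aligned} \tag{4.51}
$$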


Note that for a regular 2D grid dz = dx (see Fig. 4.12). Let us set up an example
and investigate the behaviour of the wavefield. We want to simulate P-wave
propagation in a reservoir scale model with maximum velocity $c_{max} = 5$ km/s
and minimum velocity $c_{min} = 3$ km/s. The dominant frequency is chosen to be
20 Hz (from the discussion of the source time function above we expect energy
up to 50 Hz to be present in the waveforms). The dominant wavelength is
$\lambda_{dom} = c_{min}/f_{dom} = 150$ m. For this exercise we simulate a spatial domain of 5 km x 5 km and
use a grid point distance dx = 10m resulting in 15 grid points per wavelength for 
the dominant frequency. We examine the behaviour of the wavefield looking at 
the snapshots resulting from a source injected at the centre of the model. 
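
A bare-bones version of the 2D extrapolation scheme (Eqs. 4.50 and 4.51) is sketched below. It is not the supplementary code. It assumes a small homogeneous model, plain Python lists, a Courant number of 0.5, a first-derivative Gaussian source scaled by the grid cell area, and a short run so that it finishes quickly.

```python
import math

nx = nz = 81             # small grid (assumption)
dx = 10.0                # grid increment (m), dz = dx
c = 3000.0               # homogeneous velocity (m/s), an assumption
eps = 0.5                # Courant number c*dt/dx (assumption)
dt = eps * dx / c
nt = 60                  # number of time steps
f0 = 20.0                # dominant frequency (Hz)
t0 = 1.0 / f0            # source time shift (s), an assumption
jsrc = ksrc = nx // 2    # source at the centre of the model

pold = [[0.0] * nz for _ in range(nx)]
p = [[0.0] * nz for _ in range(nx)]

for n in range(nt):
    pnew = [[0.0] * nz for _ in range(nx)]
    for j in range(1, nx - 1):
        for k in range(1, nz - 1):
            # Three-point second derivatives in x and z (Eq. 4.51)
            d2x = (p[j + 1][k] - 2.0 * p[j][k] + p[j - 1][k]) / dx ** 2
            d2z = (p[j][k + 1] - 2.0 * p[j][k] + p[j][k - 1]) / dx ** 2
            # Time extrapolation (Eq. 4.50 solved for the future field)
            pnew[j][k] = 2.0 * p[j][k] - pold[j][k] + (c * dt) ** 2 * (d2x + d2z)
    # Source: first derivative of a Gaussian, scaled by the grid cell area
    t = n * dt
    src = -8.0 * f0 * (t - t0) * math.exp(-(4.0 * f0) ** 2 * (t - t0) ** 2)
    pnew[jsrc][ksrc] += dt ** 2 * src / dx ** 2
    pold, p = p, pnew

pmax = max(abs(v) for row in p for v in row)
print("max |p| =", pmax)
```

Because the source sits at the centre of a square grid with dz = dx, the resulting field is symmetric under an exchange of x and z. Differences between the waveform along a grid axis and along the diagonal are then purely numerical.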

The results are shown in Fig. 4.13 after a propagation time of ≈ 0.5 s. For
acoustic wave propagation we expect an isotropic radiation pattern, that is, iden- 
tical waveforms propagating in all directions. While this is indeed the case for the 
example with 20 Hz, for higher frequencies we can observe that the wavefield be- 
comes anisotropic in the sense that in certain directions the wavefield deteriorates 
faster. This effect is called numerical anisotropy. In order to understand it we shall 
take another look at the numerical dispersion equation, but this time in 2D. 


4.5.1 Numerical anisotropy 


Can we understand the observed artificial anisotropic behaviour using our analyti- 
cal approach? We start with the description of a plane harmonic wave propagating 
in 2D with wavenumber vector k pointing in the direction of propagation 


$$
p(\mathbf{x}, t) = e^{i(\mathbf{k}\cdot\mathbf{x} - \omega t)} = e^{i(k_x x + k_z z - \omega t)} \tag{4.52}
$$
Fig. 4.13 Numerical anisotropy: Snapshot of acoustic wavefield for various
dominant frequencies keeping all other parameters constant: a: 20 Hz; b: 35 Hz;
c: 50 Hz; d: 65 Hz. While at 20 Hz an isotropic wave field can be observed,
at higher frequencies (fewer points per wavelength) numerical dispersion
appears first in the direction of the grid axes. This propagation-direction-dependent
effect is called numerical anisotropy.
Fig. 4.14 Numerical anisotropy. The
error of the phase velocity is shown as a 
function of propagation direction (in %) 
for varying numbers of grid points per 
wavelength. The directional dependence 
is the same but the error increases as 
the number of grid points per wavelength 
decreases. 


and use the discretization of our 2D problem to obtain 


$$
p_{j,k}^n = e^{i(k_x j\,dx + k_z k\,dx - \omega n\,dt)}. \tag{4.53}
$$


Substituting this formula with the pressure field of the source-free 2D acoustic 
wave equation (Eq. 4.50) and following the same steps as done for the 1D nu- 
merical dispersion analysis leads to the following relation for the numerical phase 
velocity in 2D (assuming a regular grid dz = dx) 


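$$
c^{num}(k_x, k_z) = \frac{2}{|\mathbf{k}|\,dt}\,
\sin^{-1}\left[ \sqrt{ c^2 \frac{dt^2}{dx^2}
\left( \sin^2\left(\frac{k_x dx}{2}\right) + \sin^2\left(\frac{k_z dx}{2}\right) \right) } \right]. \tag{4.54}
$$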


This relation can be analysed as a function of propagation direction noting that 


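$$
\mathbf{k} = \begin{pmatrix} k_x \\ k_z \end{pmatrix}
= \begin{pmatrix} |\mathbf{k}| \cos\alpha \\ |\mathbf{k}| \sin\alpha \end{pmatrix}, \tag{4.55}
$$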


and the results are shown in Fig. 4.14. The error of the phase velocity (com- 
pared to the true acoustic velocity) is shown as a function of propagation direction 


and varying number of grid points per wavelength. The anisotropic behaviour is 
striking. For $n_\lambda = 5$ the error varies from 1.8% to 5%, with the most accurate
propagation direction at an angle of 45° to the coordinate axes. This directional 
behaviour is independent of the number of grid points per wavelength, but of 
course the absolute error decreases with increasing $n_\lambda$.
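
The directional dependence can be explored by evaluating Eq. 4.54 for a range of propagation angles. The sketch below expresses the result relative to the true velocity, so that dx and c drop out. The Courant number ε = 0.5 is an assumption made for illustration, and the function name is hypothetical.

```python
import math


def phase_velocity_error(angle_deg, n_per_wavelength, eps):
    """Relative error (in %) of the 2D numerical phase velocity, Eq. 4.54,
    for a regular grid (dz = dx) and Courant number eps = c*dt/dx."""
    kdx = 2.0 * math.pi / n_per_wavelength          # |k| dx
    alpha = math.radians(angle_deg)
    sx = math.sin(0.5 * kdx * math.cos(alpha))      # sin(k_x dx / 2)
    sz = math.sin(0.5 * kdx * math.sin(alpha))      # sin(k_z dx / 2)
    # c_num / c = 2 / (|k| dx eps) * asin(eps * sqrt(sx^2 + sz^2))
    ratio = 2.0 / (kdx * eps) * math.asin(eps * math.sqrt(sx ** 2 + sz ** 2))
    return 100.0 * (ratio - 1.0)


eps = 0.5                      # Courant number, an assumption
for n in (5, 10, 20):
    errors = [phase_velocity_error(a, n, eps) for a in (0, 15, 30, 45)]
    print(f"n = {n:2d}: " + "  ".join(f"{e:6.2f}%" for e in errors))
```

The error is negative in all directions, meaning the numerical wave is too slow. It is largest along the grid axes and smallest at 45°.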

Again there is a strong message in this observation and the analytical result. 
Seismic observations provide evidence for physical anisotropy on almost all scales. 
While the effects on the travel times are usually small, any errors we are commit- 
ting in our numerical simulation map into errors in our resulting Earth model (in 
terms of tomographic inverse problems). Therefore, we must be careful to choose 
simulation parameters to ensure that we avoid numerical anisotropy. 


4.5.2 Choosing the right simulation parameters 


With the 2D acoustic algorithm we now possess a simulation tool that, despite 
its simple physics, allows us to investigate many wave-propagation phenomena 
relevant for seismology. Therefore, we can use this example to highlight the 
preparatory thinking that has to be done prior to any simulation task. We illustrate 
this with an example from earthquake seismology: the simulation of fault-zone 
trapped waves (see Fig. 4.15). Note that the scalar acoustic wave equation is 
mathematically identical to the SH-wave-propagation problem (assuming con- 
stant density). So, in the case of fault-zone trapped waves that are predominantly 
observed for SH-type ground motions, this is a useful basic physical model. 

Let us investigate the effects of a narrow 200 m-wide fault zone with a 25% 
velocity decrease. In accordance with observations the target frequency is 10 Hz 
(dominant, maximum 30 Hz). We expect that for a seismometer sitting at the 
top of the fault zone a seismogram length of $t_{max} = 3.5$ s will be sufficient to
observe trapped waves. The parameters of the physical model are summarized in 


Table 4.1 Earth model set-up for the fault zone simulation example.


Parameter              Description                    Value
$f_0$, $f_{max}$       Dominant, maximum frequency    10 Hz, 30 Hz
$c_{min}$, $c_{max}$   Min., max. velocity            2,250 m/s, 3,000 m/s
$x_{max}$, $z_{max}$   Min., max. extension           10 km, 10 km
$t_{max}$              Seismogram length              3.5 s


Table 4.1. To map this physical model on a computational grid and initialize a 


simulation we must answer the following questions: 


• What is the smallest wavelength propagating in the medium?
Answer: With $f_{max} = 30$ Hz and the smallest velocity $c_{min} = 2{,}250$ m/s the
shortest wavelength is given by $\lambda_{min} = c_{min}/f_{max} = 75$ m.


• How many grid points per smallest wavelength are required by the
numerical method (given the wave-propagation distance that needs to be
covered)?

Answer: The dominant wavelength in the low-velocity medium is
$\lambda_{dom} = c_{min}/f_{dom} = 225$ m. Thus we expect to propagate more than 20 wavelengths
to the surface. As we use a five-point operator we choose 20 points per
dominant wavelength, resulting in about 6.7 points per smallest wavelength.


• What is the grid spacing that needs to be implemented?

Answer: With 20 points per dominant wavelength we obtain
$dx = \lambda_{dom}/20 = 11.25$ m grid spacing.

• What is the size of the physical domain and how many overall grid points
are required?

Answer: The spatial extent of the 2D model is 10 km x 10 km. In each
dimension we thus need 10,000 m/dx ≈ 900 grid points, leading to 900²
overall grid points.


• Will the seismogram(s) be influenced by artificial boundary reflections (i.e.,
is it necessary to implement absorbing boundaries or increase the model
size)?

Answer: With the model set-up as chosen (the fault zone in the middle of
the model) we might be able to avoid reflections from the domain
boundaries, when seismograms are extracted only at the centre of the domain
(above the fault zone). For receivers near the boundaries or very long time
series, absorbing boundaries would be necessary.


• What is the maximum velocity in the model, and the resulting time step
(given the grid increment and the CFL criterion)?

Answer: The maximum velocity in the model is $c_{max} = 3{,}000$ m/s. Assuming
a CFL value of ε = 0.7 we can determine the time step required for a stable
simulation as $dt = \epsilon\,dx/c_{max} = 0.0026$ s.
Fig. 4.15 Landers, California. View east along a road crossing the epicentral
area of the M7.3 Landers earthquake in 1992 with a horizontal surface slip of
several metres. In many places trapped waves can be observed right above fault
zones. Trapped waves were extensively investigated with a finite-difference
method by Jahnke et al. (2002).

Fig. 4.16 Snapshot of pressure amplitude for a source located at the (left)
edge of a fault zone model (limited by vertical lines) with a 25% velocity
decrease. The vertical extension (equivalent to the source depth) is 5 km. The
snapshot indicates the occurrence of head waves (e.g. top left) and the
development of a dispersive wavefield trapped inside the low-velocity zone
(i.e., fault-zone trapped waves). Such phenomena are now widely observed above
faults in seismically active areas with high deformation rates (e.g. San
Andreas Fault, North Anatolian Fault).


• What is the overall number of time steps to be propagated?
Answer: For a desired simulation time of $t_{max} = 3.5$ s the number of time
steps required is $t_{max}/dt \approx 1,300$.

• How much core memory (RAM) will the simulation approximately require?
Answer: For a simple estimate we focus on the space-dependent fields
that will constitute the largest part of the memory allocation. Those
fields are: (1) the velocity model, (2) the pressure field at three
different time levels, and (3) two temporary fields containing the second
space derivatives. Assuming double precision floating point numbers (8 bytes
per number) this will require approximately $6 \times 900^2 \times 8$ bytes
$\approx 40$ MBytes. So you should be able to run this simulation easily on your
smartphone!
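
The chain of estimates above can be collected in a few lines of Python. The following is a minimal sketch, not part of the original simulation code: the function name `plan_simulation` and its keyword arguments are hypothetical, and the default of six stored fields follows the memory estimate given in the last answer.

```python
import math

def plan_simulation(f_dom, f_max, c_min, c_max, extent, t_max,
                    points_per_dom_wavelength=20, cfl=0.7,
                    n_fields=6, bytes_per_number=8):
    """Back-of-the-envelope set-up for a 2D finite-difference simulation."""
    lambda_min = c_min / f_max                 # shortest wavelength (m)
    lambda_dom = c_min / f_dom                 # dominant wavelength (m)
    dx = lambda_dom / points_per_dom_wavelength  # grid spacing (m)
    nx = math.ceil(extent / dx)                # grid points per dimension
    dt = cfl * dx / c_max                      # time step from the CFL criterion (s)
    nt = math.ceil(t_max / dt)                 # number of time steps
    memory_bytes = n_fields * nx**2 * bytes_per_number
    return {
        "lambda_min": lambda_min,
        "lambda_dom": lambda_dom,
        "dx": dx,
        "points_per_min_wavelength": lambda_min / dx,
        "nx": nx,
        "dt": dt,
        "nt": nt,
        "memory_bytes": memory_bytes,
    }

# Values of Table 4.1 (SI units)
plan = plan_simulation(f_dom=10.0, f_max=30.0, c_min=2250.0, c_max=3000.0,
                       extent=10_000.0, t_max=3.5)
for name, value in plan.items():
    print(f"{name:28s} {value}")
```

Because the sketch rounds the number of grid points and time steps up instead of to the nearest hundred, its output differs slightly from the rounded figures quoted in the answers.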


Results for a simulation with these parameters are shown in Fig. 4.16. The snap- 
shot at a simulation time of t = 2s indicates a highly focused wavefield propagating 
inside the fault zone towards the surface. The earthquake source is located at the 
left side of the low-velocity zone at 5km depth. In the host medium (high ve- 
locity) head waves develop at the edges of the fault zone. Within the fault zone 
there are delayed and amplified fault-zone (trapped) waves propagating towards 
the surface. In any case this simple but quite realistic structural heterogeneity has 
a dramatic effect on the wave field recorded at the surface. The short scale high 
amplitudes measured directly above the fault zone have considerable relevance 
with respect to the shaking hazard in seismically active regions, but are usually 
ignored. 

For many wave-propagation problems the questions just raised (usually of 
course for 3D problems) should be asked as part of planning a research project 
involving simulations. You want to be realistic in terms of what specifically is fea- 
sible in terms of memory and CPU time available to you. Do you want to carry 
out a few extremely highly resolved simulations or are you targeting tens of thou- 
sands of low-resolution simulations (e.g. for inverse problems or parameter space 
studies)? Answers to these questions are also required when applying for large 
computational resources at the supercomputer centres. 

There are several issues that we have not addressed here, mainly because they 
go beyond this introductory level. Some examples are: (1) What rheologies are 
necessary for your specific problem? (2) What is the degree of heterogeneity (do 
you need to use parameter-averaging schemes)? (3) What is the wavenumber 
spectrum of your Earth model and which numerical model suits best (e.g. strongly 
heterogeneous random models with short-wavelength structures might be less 
suitable for spectral methods)? (4) Will your problem require parallel implemen- 
tation? (5) How are maximum/minimum propagation velocities related? Does it 
make sense to have a uniform grid or do you need to have space-dependent grid 
increments (or variable operator accuracy in the medium)? Some of these issues 
will be discussed when presenting research problems with the various methods at 
the end of this volume. 


4.6 Elastic wave propagation in 1D 


Let us move towards a physical description of waves that is closer to seismic wave
propagation. In Part I we introduced the stress-strain relation as
$\sigma_{ij} = \lambda \epsilon_{kk} \delta_{ij} + 2 \mu \epsilon_{ij}$, where
$\lambda$, $\mu$ are the Lamé parameters and $\sigma_{ij}$ and $\epsilon_{ij}$ are
the stress and strain tensors, respectively. When considering 1D wave propagation
in x-direction and particle motion in a horizontal direction y this relation
reduces to


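$$ \sigma_{xy} = \sigma_{yx} = 2 \mu \epsilon_{xy} = 2 \mu \, \tfrac{1}{2} \big( \partial_x u_y + \underbrace{\partial_y u_x}_{=0} \big) = \mu \, \partial_x u_y , \tag{4.56} $$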


where $u_y$ is the only non-zero horizontal displacement component (and $\sigma_{xy}$
correspondingly the only non-zero stress component). The propagation of elastic
waves on strings can be described with this 1D relation. Despite the fact that 
we are dealing again with a scalar wave equation it will allow us to introduce the 
concept of grid-staggering that is a key element in many 3D finite-difference al- 
gorithms used in research today. But first let us apply the concepts of central finite 
differences to the resulting wave equation. 


4.6.1 Displacement formulation 


The 1D stress-strain relation presented in Eq. 4.56 leads to the 1D elastic wave
equation in which density $\rho(x)$ and shear modulus $\mu(x)$ are allowed to vary
in space (under some smoothness constraints)

$$ \rho \, \partial_t^2 u = \partial_x ( \mu \, \partial_x u ) + f , \tag{4.57} $$

where the space-time dependence of unknown field $u$ and force $f$ is implicitly
assumed. This is called the displacement formulation. The spatial discretization
around an arbitrary grid point $i$ located inside the medium is illustrated in
Fig. 4.17. All space-dependent fields are defined at these regular-spaced grid
locations at time level $j$ through

$$ u_i^j = u(i \, dx, j \, dt). \tag{4.58} $$


Fig. 4.17 Spatial discretization and indexing in 1D.

We proceed by using the definition of centred finite differences replacing the
partial differentiation in space and time.
In space we obtain the derivative of the displacement $u'$ with reference to space
coordinate x as (omitting the superscript for time)

$$ \partial_x u \approx u' = \frac{u_{i+1} - u_{i-1}}{2 \, dx} , \tag{4.59} $$

where the derivative is defined at location $i$. Note that the function value $u_i$
is not used for the evaluation of the derivative because of the anti-symmetry of
the difference operator. To obtain the first term of the right-hand side of
Eq. 4.57 we multiply by the shear modulus $\mu$ defined at location $i$. We indicate
this locality with a vertical bar and subscript $i$


$$ \mu \, \partial_x u \big|_i = \mu u' \big|_i = \mu_i \frac{u_{i+1} - u_{i-1}}{2 \, dx} . \tag{4.60} $$


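The next step is to take the derivative of this term by evaluating the difference at
positions $i+1$ and $i-1$, again using a central difference formula, to finally obtain

$$ \partial_x ( \mu \, \partial_x u ) \big|_i
   = \frac{ \mu u' |_{i+1} - \mu u' |_{i-1} }{2 \, dx}
   = \frac{ \mu_{i+1} \frac{u_{i+2} - u_i}{2 \, dx} - \mu_{i-1} \frac{u_i - u_{i-2}}{2 \, dx} }{2 \, dx}
   = \frac{ \mu_{i+1} u_{i+2} - \mu_{i+1} u_i - \mu_{i-1} u_i + \mu_{i-1} u_{i-2} }{4 \, dx^2} \tag{4.61} $$

for the stresses at grid point $i$.
Approximating the left-hand side of the wave equation with a centred scheme
for the second time derivative at time level $j$ for grid point $i$ leads to

$$ \frac{ u_i^{j+1} - 2 u_i^j + u_i^{j-1} }{dt^2}
   = \frac{ \mu_{i+1} u_{i+2}^j - \mu_{i+1} u_i^j - \mu_{i-1} u_i^j + \mu_{i-1} u_{i-2}^j }{4 \rho_i \, dx^2}
   + \frac{f_i^j}{\rho_i} \tag{4.62} $$

and the final extrapolation scheme for the displacement-stress 1D elastic wave
equation using a central difference scheme

$$ u_i^{j+1} = \frac{dt^2}{4 \rho_i \, dx^2}
   \left[ \mu_{i+1} u_{i+2}^j - \mu_{i+1} u_i^j - \mu_{i-1} u_i^j + \mu_{i-1} u_{i-2}^j \right]
   + 2 u_i^j - u_i^{j-1} + \frac{dt^2}{\rho_i} f_i^j . \tag{4.63} $$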


In this formulation we calculate directly derivatives of the elastic parameters 
(that might be discontinuous in the Earth). Also, we employ a central difference 
scheme with a 2dx grid spacing ignoring the information of the field at the central 
location. This situation can be improved by solving another form of the wave 
equation and using the staggered-grid concept. While instructive for introducing 
finite-differences to wave equations, the centered schemes in this formulation are 
inefficient for realistic problems. 
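
To make the displacement scheme concrete, here is a minimal NumPy sketch of one time step of Eq. 4.63. The function name `displacement_step` and the choice to leave the two outermost points on each side at zero are assumptions made for illustration. The short check uses a quadratic displacement field, for which the wide central difference is exact.

```python
import numpy as np

def displacement_step(u, u_old, mu, rho, f, dx, dt):
    """Return the displacement at time level j+1 following Eq. 4.63.

    u, u_old : displacement at time levels j and j-1
    mu, rho  : shear modulus and density at the grid points
    f        : force term at time level j
    The two outermost points on each side are not updated (left at zero).
    """
    u_new = np.zeros_like(u)
    i = np.arange(2, len(u) - 2)               # interior points only
    d2 = (mu[i + 1] * u[i + 2] - mu[i + 1] * u[i]
          - mu[i - 1] * u[i] + mu[i - 1] * u[i - 2]) / (4.0 * dx**2)
    u_new[i] = dt**2 / rho[i] * (d2 + f[i]) + 2.0 * u[i] - u_old[i]
    return u_new

# Check with u = x^2 in a homogeneous medium: d/dx(mu du/dx) = 2 mu
nx, dx, dt = 11, 0.5, 0.1
x = np.arange(nx) * dx
mu = np.full(nx, 2.0)
rho = np.full(nx, 1.0)
u = x**2
u_new = displacement_step(u, u.copy(), mu, rho, np.zeros(nx), dx, dt)
increment = u_new[2:-2] - u[2:-2]               # expected: 2 mu dt^2 / rho
print(increment)
```

Note that the stencil only touches the points $i \pm 2$ and $i$, which illustrates the remark that the effective grid spacing of this scheme is $2\,dx$.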


4.6.2 Velocity-stress formulation 


As indicated at the beginning of this chapter, the error of a finite-difference
approximation depends on the size of the increment $dx$ employed in the derivative
approximation. In the present scheme the leading error is a quadratic function of
$dx$. If we were able to reduce the size of this increment by a factor of 2 the
error of the scheme would be four times smaller. Defining

$$ v = \partial_t u , \qquad \sigma = \mu \, \partial_x u , \tag{4.64} $$

where $v$ is velocity, $\sigma = \sigma_{xy} = \sigma_{yx}$ representing the only
non-zero stress component, and implicitly assuming space-time dependencies leads
to the wave equation as a coupled system of two first-order partial differential
equations

$$ \rho \, \partial_t v = \partial_x \sigma + f , \qquad
   \partial_t \sigma = \mu \, \partial_x v . \tag{4.65} $$

Note that we are not taking second derivatives any more, nor are we directly
calculating derivatives of the material parameters. Our unknowns are the discrete
velocity and stress values

$$ v_i^j = v(i \, dx, j \, dt), \tag{4.66} $$

again defined on a regular spaced grid in time and space. We proceed by replacing
the partial differentials with centred finite-difference approximations to the first
derivative. However, these are not defined at the grid points of the function but
in between (i.e., $i + \frac{1}{2}$). Remember that the difference operator is
antisymmetric, which implies that the information at the location of the derivative
is not used anyway. The grid staggering is illustrated in Fig. 4.18.⁹ The following
computational scheme does the trick:


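$$ \frac{ v_i^{j+\frac{1}{2}} - v_i^{j-\frac{1}{2}} }{dt}
   = \frac{1}{\rho_i} \frac{ \sigma_{i+\frac{1}{2}}^{j} - \sigma_{i-\frac{1}{2}}^{j} }{dx} + \frac{f_i^j}{\rho_i} , \qquad
   \frac{ \sigma_{i+\frac{1}{2}}^{j+1} - \sigma_{i+\frac{1}{2}}^{j} }{dt}
   = \mu_{i+\frac{1}{2}} \frac{ v_{i+1}^{j+\frac{1}{2}} - v_i^{j+\frac{1}{2}} }{dx} \tag{4.67} $$

leading to the extrapolation scheme

$$ v_i^{j+\frac{1}{2}} = \frac{dt}{\rho_i} \frac{ \sigma_{i+\frac{1}{2}}^{j} - \sigma_{i-\frac{1}{2}}^{j} }{dx}
   + v_i^{j-\frac{1}{2}} + \frac{dt}{\rho_i} f_i^j , \qquad
   \sigma_{i+\frac{1}{2}}^{j+1} = dt \, \mu_{i+\frac{1}{2}} \frac{ v_{i+1}^{j+\frac{1}{2}} - v_i^{j+\frac{1}{2}} }{dx}
   + \sigma_{i+\frac{1}{2}}^{j} . \tag{4.68} $$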

First of all, we note that the space-dependent properties of the (Earth) model
$\rho$ and $\mu$ are not defined at the same locations. Also, stress and velocity
are staggered in space and time. Yet, the scheme is consistent in the sense that we
are always multiplying or adding terms that are defined at the same location
or time level. This is an inherent property of staggered elastic finite-difference
schemes, with some exceptions (e.g. when the medium is anisotropic or wave
equations in other coordinate systems are being solved). In the following, we
will present a simulation example using this scheme and investigate its dispersion
characteristics.

Fig. 4.18 Spatio-temporal grid staggering in 1D. The vertical indexing
corresponds to time and the horizontal indexing to space.

9 For any staggered finite-difference scheme it is useful to draw the spatial
scheme in this way in order to make sure the indexes are properly addressed in the
computer program. On first encounter it may take a while to digest these schemes.
Please note that the definition of indexing is not unique. The results are the same
but it has to be consistent.

Table 4.2 Simulation parameters for 1D velocity-stress simulation

Parameter    Value
$\rho$       2,500 kg/m³
$\mu$        ≈ 50 GPa
$v_s$        4,500 m/s
$x_{max}$    1,000 km
$dx$         1,000 m
$dt$         0.18 s
$f_0$        1/15 Hz

10 Remember the displacement-stress formulation requires three time levels to be
evaluated and stored.


4.6.3 Velocity-stress algorithm: example 


Let us illustrate the velocity-stress algorithm with an example. We initialize the
homogeneous model with the parameters given in Table 4.2. The example has
dimensions encountered in regional seismology with a physical domain of 1,000 km,
a shear-wave-propagation velocity of 4,500 m/s (representative of an upper mantle
shear velocity or a near-surface Love wave velocity), and a dominant period
of 15 s. The Python code fragment shown below represents the core of the 1D
velocity-stress finite-difference algorithm.


# Time extrapolation (s, v, ds, dv, rho, mu are NumPy arrays of length nx)
for it in range(nt):
    # Stress derivative
    for i in range(1, nx-1):
        ds[i] = (s[i+1] - s[i])/dx
    # Velocity extrapolation
    v = v + dt/rho*ds
    # Add source term at isx
    v[isx] = v[isx] + dt*src[it]/(dx*rho[isx])
    # Velocity derivative
    for i in range(1, nx-1):
        dv[i] = (v[i] - v[i-1])/dx
    # Stress extrapolation
    s = s + dt*mu*dv

In this code fragment, which presents the time extrapolation loop, it is the time
level, and s, v are respectively the unknown stress and velocity values. Note that
velocity and stress, as well as the given Earth model parameters ($\rho$, $\mu$) are
vectors (allowing heterogeneous models to be initialized). The source is injected as
a force term at grid point isx. We stress here that code lines like the velocity
extrapolation are not mathematical statements. In this case the updated velocity
field is allocated to the same vector (thus overwritten) as only two time levels are
required in this formulation.¹⁰ In the computer code one does not work with index
fractions as in the mathematical algorithm. The spatial loops avoid the points
beyond the boundaries. Boundary conditions are briefly discussed at the end of this
chapter.
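
For readers who want to run the fragment, the following self-contained sketch wraps it with the parameters of Table 4.2. Several choices are assumptions made here for illustration and are not taken from the text: the number of time steps `nt`, the source and receiver indices `isx` and `irx`, the source delay `t0`, and the exact scaling of the first-derivative-of-a-Gaussian source. The inner loops are written as array slices, which is equivalent to the explicit loops over i.

```python
import numpy as np

# Parameters from Table 4.2
nx = 1000            # number of grid points (1,000 km with dx = 1 km)
dx = 1000.0          # grid spacing (m)
c0 = 4500.0          # shear velocity (m/s)
rho0 = 2500.0        # density (kg/m^3)
dt = 0.18            # time step (s)
f0 = 1.0 / 15.0      # dominant frequency (Hz)

# Assumed run set-up (hypothetical values, not from the text)
nt = 1000            # number of time steps (180 s)
isx = 500            # source grid point
irx = 700            # receiver grid point, 200 km from the source
t0 = 4.0 / f0        # source time shift (s)

rho = np.full(nx, rho0)
mu = rho * c0**2     # shear modulus from mu = rho * c^2

# Source time function: first derivative of a Gaussian (assumed scaling)
time = np.arange(nt) * dt
src = -2.0 * (time - t0) * f0**2 * np.exp(-(f0**2) * (time - t0) ** 2)

v = np.zeros(nx)     # velocity
s = np.zeros(nx)     # stress
dv = np.zeros(nx)
ds = np.zeros(nx)
seis = np.zeros(nt)  # velocity seismogram at the receiver

for it in range(nt):
    ds[1:-1] = (s[2:] - s[1:-1]) / dx                 # stress derivative
    v = v + dt / rho * ds                             # velocity extrapolation
    v[isx] = v[isx] + dt * src[it] / (dx * rho[isx])  # source injection
    dv[1:-1] = (v[1:-1] - v[:-2]) / dx                # velocity derivative
    s = s + dt * mu * dv                              # stress extrapolation
    seis[it] = v[irx]

t_peak = time[np.argmax(np.abs(seis))]
print(f"CFL number c0*dt/dx = {c0 * dt / dx:.2f}")
print(f"largest |v| at the receiver occurs at t = {t_peak:.1f} s")
```

With these settings the wavelet centre should reach the receiver roughly 200 km / 4.5 km/s ≈ 44 s after the source delay `t0`, well before any reflection from the ends of the model can return.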

Results of the simulation are shown in Fig. 4.19. In the case of a homogeneous
medium, we expect theoretically the source wavelet (the first derivative of
a Gaussian) to propagate undisturbed for ever. Of course this is not the case
in our numerical world. We observe, as in the acoustic case, that the waveform
breaks down as a consequence of numerical dispersion. Let us quantify again this
dispersion behaviour for this frequently used elastic grid-staggering scheme and
discuss the consequences.


4.6.4 Velocity-stress: dispersion 


Applying the same procedure as given in the section on numerical dispersion of
the acoustic problem above we obtain the condition for stable calculations

$$ \sin \left( \frac{\omega \, dt}{2} \right) = \sqrt{\frac{\mu}{\rho}} \, \frac{dt}{dx} \, \sin \left( \frac{k \, dx}{2} \right) , \tag{4.69} $$

where the physical parameters $\mu$ and $\rho$ are assumed constant. Note that this
is the same relation as obtained in the acoustic case.
For the numerical phase velocity as a function of wavenumber (wavelength, or
frequency, or number of points per wavelength) we obtain

$$ c^{num} = \frac{\omega}{k} = \frac{\lambda}{\pi \, dt} \sin^{-1} \left( c_0 \frac{dt}{dx} \sin \frac{\pi \, dx}{\lambda} \right) , \tag{4.70} $$

where $k = 2\pi/\lambda$ was used. Energy propagates with group velocity, so the
accuracy of simulations should be checked against group velocity $c_g$, defined as

$$ c_g = \frac{d\omega}{dk} = \frac{ c_0 \cos \frac{\pi \, dx}{\lambda} }{ \sqrt{ 1 - \left( c_0 \frac{dt}{dx} \sin \frac{\pi \, dx}{\lambda} \right)^2 } } . \tag{4.71} $$

Fig. 4.19 Simulation example with the velocity-stress formulation. Seismograms
are shown at various distances from the source for a dominant period of 15 s. The
numerical results (solid line) are compared with the analytical solution (dashed
line). Note the increasing difference between analytical and numerical solution
with propagating distance.

Fig. 4.20 Bottom: Dispersion curves for the 1D velocity-stress staggered-grid
finite-difference scheme as a function of number of grid points per spatial
wavelength. Phase and group velocity are given as solid and dashed lines,
respectively. The normalized spectrum of the source time function (simulation
example with dominant frequency $f_0 = 1/15$ Hz) is superimposed. Top: Detail
towards the short-wavelength end of the spectrum.


These relations allow us now to investigate the behaviour of the propagation
velocities as a function of frequency. It is instructive to translate the frequency
axis into the number of grid points per spatial wavelength using $c = \omega/k$ and
the corresponding $dx$ of our simulation example. The results are illustrated in
Fig. 4.20. For any regularly spaced field the Nyquist wavenumber corresponds to
2 points per wavelength (the left end of the horizontal axis). As can be seen in
the figure the phase, and even more so the group velocity, substantially deviate
from the theoretical propagation velocity as the number of grid points becomes
smaller.

It is instructive to superimpose the spectrum of the source time function (a 
first derivative of a Gaussian) and translate it into the same coordinate system 
(amplitude normalized). For the simulation example discussed in the previous
section there is substantial energy in the pulse between 20 and 300 grid points 
per wavelength while the dominant frequency is sampled with ~66 points. Given 
these relatively large numbers it may appear surprising how bad the results are. 
However, an important aspect to consider is always the propagation distance. 
Also, we used here the lowest-order finite-difference operator. The results can be 
substantially improved using a four-point operator for the derivative calculations. 
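
The dispersion relations of Eqs. 4.70 and 4.71 are easy to evaluate numerically. The sketch below does so for the parameters of the simulation example; the function names and the use of the number of grid points per wavelength as the independent variable are choices made here for illustration.

```python
import numpy as np

def phase_velocity(n_ppw, c0, dt, dx):
    """Numerical phase velocity (Eq. 4.70) for n_ppw grid points per wavelength."""
    lam = n_ppw * dx
    return lam / (np.pi * dt) * np.arcsin(c0 * dt / dx * np.sin(np.pi * dx / lam))

def group_velocity(n_ppw, c0, dt, dx):
    """Numerical group velocity (Eq. 4.71) for n_ppw grid points per wavelength."""
    lam = n_ppw * dx
    a = c0 * dt / dx * np.sin(np.pi * dx / lam)
    return c0 * np.cos(np.pi * dx / lam) / np.sqrt(1.0 - a**2)

# Parameters of the simulation example (Table 4.2)
c0, dt, dx = 4500.0, 0.18, 1000.0

for n in (2.0, 5.0, 10.0, 20.0, 66.0):
    cp = phase_velocity(n, c0, dt, dx)
    cg = group_velocity(n, c0, dt, dx)
    print(f"{n:5.0f} points/wavelength: c_phase/c0 = {cp / c0:.4f}, "
          f"c_group/c0 = {cg / c0:.4f}")
```

At the Nyquist limit of 2 points per wavelength the group velocity drops to zero, while both velocities approach $c_0$ from below as the number of points per wavelength grows.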


4.7 Elastic wave propagation in 2D 


4.7.1 Grid staggering 


Despite the focus of this volume on the presentation of various numerical 
schemes in the 1D case, we present here some basic aspects of grid staggering 
in 2D because (1) it contains the fundamental aspects of grid staggering for 
the stress-strain relation for higher dimensions, and (2) the extension to 3D is 
straightforward. Furthermore, this scheme is one of the most widely used nu- 
merical approximations in seismological research today. Here we only discuss 
the stress-strain relation. The complete 2D or 3D staggered grid algorithm is 
presented in other books or review papers (see suggestions at the end of this 
chapter). 
Let us recall the (time-derivative of the) stress-strain relation 


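$$ \partial_t \sigma_{ij} = \lambda \, \partial_t \epsilon_{kk} \, \delta_{ij} + 2 \mu \, \partial_t \epsilon_{ij} \tag{4.72} $$

and rewrite it in 2D using the definition of the strain tensor to obtain for each
component

$$ \begin{aligned}
   \partial_t \sigma_{xx} &= (\lambda + 2\mu) \, \partial_x v_x + \lambda \, \partial_z v_z \\
   \partial_t \sigma_{zz} &= (\lambda + 2\mu) \, \partial_z v_z + \lambda \, \partial_x v_x \\
   \partial_t \sigma_{xz} &= \mu \, ( \partial_x v_z + \partial_z v_x ) ,
   \end{aligned} \tag{4.73} $$

where $v_x$ and $v_z$ are the two components of the velocity vector. The first
derivatives of the velocity field with respect to both spatial coordinates need to
be evaluated. Following the concepts described above, this can be achieved using
the staggering scheme presented in Fig. 4.21.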

Here, the diagonal elements of the stress tensor are defined at the same
locations, while both velocity components and the off-diagonal stress component are
at staggered locations, shifted along the coordinate axes. This leads to a consistent
scheme in which the first derivatives only need to be calculated at these staggered
grid locations. However, note that to evaluate the stresses and extrapolate the
velocity field, the physical parameters $\rho$, $\lambda$, $\mu$ have to be known at
different locations inside one grid cell. In the case of a heterogeneous medium,
this has to be taken into account and, unless the variations of the parameters are
known analytically, the values will have to be interpolated to the staggered grid
locations.

Fig. 4.21 Spatial grid staggering for the 2D velocity-stress formulation. Dots
denote time derivatives.

Fig. 4.22 Stress imaging. The stress-free boundary condition can be implemented
numerically by adding an artificial domain outside the medium and applying
symmetry conditions to stresses and velocity (see text for details).
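
As an illustration of the stress update in Eq. 4.73 on a staggered grid, consider the following minimal sketch. The array layout is an assumption chosen for this example and may differ from the one drawn in Fig. 4.21: `vx[i, k]` sits at (i, k), `vz[i, k]` at (i+1/2, k+1/2), the normal stresses at (i+1/2, k), and the shear stress at (i, k+1/2). The shear modulus is passed twice (`mu_n`, `mu_s`) to make explicit that it is needed at two different locations of the grid cell.

```python
import numpy as np

def update_stress(sxx, szz, sxz, vx, vz, lam, mu_n, mu_s, dx, dz, dt):
    """Second-order staggered-grid update of the 2D stresses (Eq. 4.73).

    lam, mu_n : Lame parameters at the normal-stress locations (i+1/2, k)
    mu_s      : shear modulus at the shear-stress locations (i, k+1/2)
    Points without a complete stencil are left unchanged.
    """
    # Derivatives located at the normal-stress points (i+1/2, k)
    dvx_dx = (vx[1:, 1:] - vx[:-1, 1:]) / dx
    dvz_dz = (vz[:-1, 1:] - vz[:-1, :-1]) / dz
    l, m = lam[:-1, 1:], mu_n[:-1, 1:]
    sxx[:-1, 1:] += dt * ((l + 2.0 * m) * dvx_dx + l * dvz_dz)
    szz[:-1, 1:] += dt * ((l + 2.0 * m) * dvz_dz + l * dvx_dx)

    # Derivatives located at the shear-stress points (i, k+1/2)
    dvz_dx = (vz[1:, :-1] - vz[:-1, :-1]) / dx
    dvx_dz = (vx[1:, 1:] - vx[1:, :-1]) / dz
    sxz[1:, :-1] += dt * mu_s[1:, :-1] * (dvz_dx + dvx_dz)
    return sxx, szz, sxz

# Check with a linear velocity field vx = a*x, vz = b*z (uniform strain rate)
nx, nz, dx, dz, dt = 6, 5, 2.0, 3.0, 0.01
a, b, lam0, mu0 = 0.1, -0.2, 1.5, 2.0
i, k = np.meshgrid(np.arange(nx), np.arange(nz), indexing="ij")
vx = a * i * dx                      # vx at x = i*dx
vz = b * (k + 0.5) * dz              # vz at z = (k+1/2)*dz
lam = np.full((nx, nz), lam0)
mu = np.full((nx, nz), mu0)
sxx, szz, sxz = (np.zeros((nx, nz)) for _ in range(3))
update_stress(sxx, szz, sxz, vx, vz, lam, mu, mu, dx, dz, dt)
print(sxx[0, 1], szz[0, 1], sxz[1, 0])
```

For a heterogeneous model, `mu_s` would be obtained by interpolating the shear modulus to the shear-stress locations, as described in the paragraph above.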


4.7.2 Free-surface boundary condition


For simulation problems involving the Earth's surface, the implementation of the
free-surface boundary condition is crucial. As discussed in the introductory
sections on wave propagation, assuming $z$ as the vertical direction, the traction
at the surface is zero, which implies that stress components

$$ \sigma_{iz} = 0 \tag{4.74} $$

vanish. It is instructive to reduce this problem to 2D and demonstrate how it can
be implemented in a basic way. This approach goes back to Levander (1988).
From the stress-strain relation we know that

$$ \begin{aligned}
   \partial_t \sigma_{zz} &= 0 = \lambda \, \partial_x v_x + (\lambda + 2\mu) \, \partial_z v_z \\
   \partial_t \sigma_{xz} &= 0 = \mu \, ( \partial_x v_z + \partial_z v_x ) .
   \end{aligned} \tag{4.75} $$


The procedure is illustrated in Fig. 4.22. The medium is extended beyond (above) 
the interior domain for as many points as required by the length of the finite- 
difference operator. One index level (here 7) is defined as the location of the 
free surface. Here, we impose the stresses to be zero. If the stresses are extended 
beyond the free surface in an anti-symmetric way, the stress-free condition is 
fulfilled. The velocities are imposed to be symmetric such that the vertical gradi- 
ents vanish. The horizontal derivatives in the above equation can be calculated as 
usual. This implementation is not unique. There are other options, which were 
discussed in Gottschämmer and Olsen (2001).

This approach is a low-order implementation of the free surface which often 
is not accurate enough when dealing with surface waves that propagate many 
wavelengths. Finding more accurate solutions for the free-surface problem led 
to alternative strategies such as one-sided approximations (Kristek et al., 2002; 
Moczo et al., 2004) and hybrid solutions, exploiting the fact that in finite-element 
simulations the free-surface boundary condition is implicitly fulfilled (Galis et al., 
2008). These schemes are presented in detail and compared with each other in 
Moczo et al. (2014). 
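
The imaging idea described above can be sketched in a few lines. This is a deliberately simplified illustration and not the implementation of any of the cited papers: all fields are treated as if they were located on the same vertical index, ignoring the half-grid shifts of a staggered grid, and the function name `apply_stress_imaging` as well as the parameters `k_fs` (index of the free surface) and `n_ghost` (number of image points) are hypothetical.

```python
import numpy as np

def apply_stress_imaging(szz, sxz, vx, vz, k_fs, n_ghost):
    """Impose a stress-free surface at vertical index k_fs by imaging.

    Arrays have shape (nx, nz); indices k < k_fs lie above the surface
    (the artificial domain), indices k > k_fs inside the medium.
    """
    # Stresses vanish at the free surface itself
    szz[:, k_fs] = 0.0
    sxz[:, k_fs] = 0.0
    for m in range(1, n_ghost + 1):
        # Anti-symmetric extension of the stresses
        szz[:, k_fs - m] = -szz[:, k_fs + m]
        sxz[:, k_fs - m] = -sxz[:, k_fs + m]
        # Symmetric extension of the velocities
        vx[:, k_fs - m] = vx[:, k_fs + m]
        vz[:, k_fs - m] = vz[:, k_fs + m]

# Example: two image points, as needed for a four-point operator
rng = np.random.default_rng(0)
nx, nz, k_fs, n_ghost = 8, 12, 2, 2
szz, sxz, vx, vz = (rng.standard_normal((nx, nz)) for _ in range(4))
apply_stress_imaging(szz, sxz, vx, vz, k_fs, n_ghost)
print(szz[0, :5])
```

Such a routine would be called once per time step, before the spatial derivatives across the surface are evaluated.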
4.8 The road to 3D


The extension of a 2D acoustic finite-difference scheme as presented here to 3D 
is very easy: just add another dimension to all fields. This is a useful exercise (e.g. 
using the 2D schemes presented in the supplementary material) and 3D acoustic 
simulations can be run on PCs providing the models are not too large. A review 
of 3D acoustic finite-difference methods was presented by Etgen and O’Brien 
(2007). 

The extension of staggered-grid schemes is also straightforward, but requires 
care concerning the proper implementation of differential operators and bound- 
ary conditions. There are many papers that present entire algorithms that can 
be used to develop 3D codes. Examples are Graves (1996), Igel et al. (1995) 
(anisotropic case), Pitarka (1999) introducing heterogeneous grids, and Kristek 
and Moczo (2003) the viscoelastic case. The book by Moczo et al. (2014) con- 
tains most conceivable finite-difference schemes in detail and is the best source 
for developers. The classic 3D staggered grid is shown in Fig. 4.23. 

The following sections aim at winding you down (or hyping you up) with 
some interesting further aspects and developments that might raise your interest. 
Some of the topics covered here go beyond the introductory level but might be 
important for the implementation of competitive algorithms. 


4.8.1 High-order extrapolation schemes 


This volume has a clear focus on explaining the mathematical approaches of
the various numerical techniques concerning the space-dependent discretization
of wave propagation. Therefore, with the exception of the finite-volume and
the discontinuous Galerkin methods, the time extrapolation scheme is of lowest
order. The accuracy of all schemes can be improved by applying high-order
extensions. Most community software packages do not go beyond second-order
extrapolation schemes.

Fig. 4.23 3D staggered finite-difference grid (velocity stress). The off-diagonal
stresses $\sigma_{ij}$ are separated from the diagonal elements $\sigma_{ii}$ and
the velocity components $v_i$ by half a grid distance (dx, dy, dz).

As a first example we illustrate the predictor-corrector scheme for a general
first-order system (e.g. advection equation, velocity-stress formulation), which
can be described by

$$ \partial_t q(x, t) = L(q, t) = c \, \partial_x q(x, t) + s(x, t) , \tag{4.76} $$

where $q$ is the solution field, $c$ is velocity, and $L$ is a linear operator. The
first-order Euler scheme corresponds to

$$ q(x, t + dt) = q(x, t) + dt \, L(q, t) , \tag{4.77} $$

which in many cases is not accurate enough. A far better scheme is the so-called
predictor-corrector method (or Heun's method; it also belongs to the family
of Runge-Kutta methods) that is obtained from the Euler method using the
trapezoidal rule. With the definitions above we obtain

$$ \begin{aligned}
   k_1 &= L(q, t) && \text{predictor} \\
   k_2 &= L(q + dt \, k_1, t + dt) && \text{corrector} \\
   q(x, t + dt) &= q(x, t) + \tfrac{1}{2} \, dt \, (k_1 + k_2) ,
   \end{aligned} \tag{4.78} $$


with marked improvement compared to the Euler scheme. We make use of this 
approach in the chapters on the finite-volume and the discontinuous Galerkin 
methods for implementation examples and results. 
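
The gain in accuracy can be demonstrated with a small experiment. The sketch below implements the Euler and predictor-corrector steps for a generic operator `L(q, t)` and applies them to the simple test problem $\partial_t q = -q$, which is chosen here only because its exact solution $e^{-t}$ is known; the function names are hypothetical.

```python
import numpy as np

def euler_step(L, q, t, dt):
    """First-order Euler step (Eq. 4.77)."""
    return q + dt * L(q, t)

def heun_step(L, q, t, dt):
    """Predictor-corrector (Heun) step (Eq. 4.78)."""
    k1 = L(q, t)                        # predictor slope
    k2 = L(q + dt * k1, t + dt)         # corrector slope at the predicted state
    return q + 0.5 * dt * (k1 + k2)

def integrate(step, L, q0, t_end, n_steps):
    """Advance q0 from t = 0 to t_end with n_steps equal time steps."""
    dt = t_end / n_steps
    q, t = q0, 0.0
    for _ in range(n_steps):
        q = step(L, q, t, dt)
        t += dt
    return q

# Test problem (an assumption for illustration): dq/dt = -q, q(0) = 1
decay = lambda q, t: -q
exact = np.exp(-1.0)
err_euler = abs(integrate(euler_step, decay, 1.0, 1.0, 100) - exact)
err_heun = abs(integrate(heun_step, decay, 1.0, 1.0, 100) - exact)
print(f"Euler error: {err_euler:.2e}, predictor-corrector error: {err_heun:.2e}")
```

The operator `L` can equally return a vector, for example the discretized right-hand side of Eq. 4.76, in which case `q` is a NumPy array.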

As a second example we present a powerful concept that goes back to the
work of Lax and Wendroff (1960), sometimes also referred to as the
Cauchy-Kowalewski procedure. We start with the Taylor expansion of a space-time
dependent function (e.g. acoustic wavefield)

$$ q(x, t + dt) = q(x, t) + dt \, \partial_t q(x, t) + \frac{dt^2}{2!} \, \partial_t^2 q(x, t) + \dots
   = \sum_{j=0}^{\infty} \frac{dt^j}{j!} \, \partial_t^j q(x, t) . \tag{4.79} $$

We know that $q(x,t)$ is the solution to the advection problem (Eq. 4.76). Thus the
following relation also holds

$$ \partial_t^j q(x, t) = c \, \partial_x \left[ \partial_t^{j-1} q(x, t) \right] , \tag{4.80} $$


indicating that we can calculate time derivatives of q(x,t) of any order in a 
recursive way making use of the advection equation. In other words we replace 


time derivatives by space derivatives. This approach was used to develop the 
Arbitrary high-orDER (ADER) schemes for the finite-volume and discontin- 
uous Galerkin methods (e.g. Titarev and Toro 2002; Dumbser and Munz, 
2005a). This formulation also works for the second-order wave equation (e.g. 
Dablain 1986; Igel et al., 1995). Further solution schemes not discussed here are 
the Crank—Nicolson scheme, the Newmark scheme, and high-order Runge-Kutta 
schemes. 
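
A second-order instance of this idea is sketched below for the source-free advection equation $\partial_t q = c \, \partial_x q$ on a periodic grid. Truncating the Taylor series after the $dt^2$ term and replacing $\partial_t q$ by $c \, \partial_x q$ and $\partial_t^2 q$ by $c^2 \partial_x^2 q$ gives the classic Lax-Wendroff update. The periodic boundaries, the sine-wave initial condition, and the function names are assumptions made for this illustration; the plain Euler update with the same spatial operator is included for comparison.

```python
import numpy as np

def lax_wendroff_step(q, c, dx, dt):
    """Taylor expansion up to dt^2 with time derivatives replaced by
    space derivatives (periodic boundaries via np.roll)."""
    dq = (np.roll(q, -1) - np.roll(q, 1)) / (2.0 * dx)         # d/dx
    d2q = (np.roll(q, -1) - 2.0 * q + np.roll(q, 1)) / dx**2   # d^2/dx^2
    return q + dt * c * dq + 0.5 * dt**2 * c**2 * d2q

def euler_step(q, c, dx, dt):
    """Taylor expansion truncated after the first-order term."""
    dq = (np.roll(q, -1) - np.roll(q, 1)) / (2.0 * dx)
    return q + dt * c * dq

nx, c = 200, 1.0
dx = 1.0 / nx
x = np.arange(nx) * dx
dt = 0.5 * dx / c
nt = 200

q_lw = np.sin(2.0 * np.pi * x)
q_eu = q_lw.copy()
for _ in range(nt):
    q_lw = lax_wendroff_step(q_lw, c, dx, dt)
    q_eu = euler_step(q_eu, c, dx, dt)

# Exact solution of dq/dt = c dq/dx: the initial shape shifted by c*t
exact = np.sin(2.0 * np.pi * (x + c * nt * dt))
err_lw = np.max(np.abs(q_lw - exact))
err_eu = np.max(np.abs(q_eu - exact))
print(f"Lax-Wendroff error: {err_lw:.2e}, Euler error: {err_eu:.2e}")
```

Higher-order variants follow the same pattern by keeping more terms of the Taylor series.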


4.8.2 Heterogeneous Earth models 


The finite-difference method belongs to the class of numerical approaches that 
is primarily used on regular equally spaced grids, multi-domain equally spaced 
grids, or space-tree-based solutions with local refinements. Therefore, in cases 
where interfaces (i.e. material discontinuities) are not aligned with the coordinate 
axes there is a problem (see Fig. 4.24 for an illustration). The geometry of the 
interfaces is not accurately modelled. In part this restriction motivated the devel- 
opment of other methods such as Galerkin-type approaches (e.g. finite/spectral 
elements) for seismic wave propagation. 

However, there are ways to improve the situation by modifying the elastic 
parameters and density around the interface, replacing them using appropriate 
averaging schemes. Muir et al. (1992) applied equivalent medium theory to this
problem, replacing the isotropic parameters with anisotropic parameters. An
alternative approach for smoothly varying media was presented by Moczo et al.
(2002) using volume and arithmetic averaging of elastic moduli and density for
isotropic media, and for viscoelastic media by Kristek and Moczo (2003).
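
To give a flavour of such averaging, the following 1D sketch computes effective parameters at the staggered (half-grid) locations from values given at the integer grid points. The particular combination used here, a harmonic average for the modulus and an arithmetic average for the density, is one common choice and is stated as an assumption; it is not meant as a reproduction of the schemes in the cited papers.

```python
import numpy as np

def harmonic_average(mu):
    """Effective modulus at i+1/2 as the harmonic mean of neighbouring values."""
    return 2.0 / (1.0 / mu[:-1] + 1.0 / mu[1:])

def arithmetic_average(rho):
    """Effective density at i+1/2 as the arithmetic mean of neighbouring values."""
    return 0.5 * (rho[:-1] + rho[1:])

# Two-layer model with an interface between grid points 1 and 2
mu = np.array([1.0, 1.0, 3.0, 3.0])
rho = np.array([2.0, 2.0, 4.0, 4.0])

mu_half = harmonic_average(mu)
rho_half = arithmetic_average(rho)
print(mu_half, rho_half)
```

Away from the interface the averaged values equal the original ones; only the point straddling the interface receives an intermediate, effective value.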

Yet another strategy to cope with strongly heterogeneous media is to make 
the grid density space-dependent. Examples are presented in Jastram and Tess- 
mer (1994). More recent results are reported in Moczo et al. (2010a) comparing 
numerical solutions for an earthquake ground motion problem, indicating the 
benefits of discontinuous finite-difference grids. Other technical developments 
include the introduction of spatially varying time steps for very heterogeneous 
problems (Tessmer, 2000). The results presented by Moczo et al. (20100) indi- 
cate that care has to be taken when the ratio between P- and S-velocity exceeds 
certain values. 

A more general treatment of the strongly heterogeneous Earth models leads to 
the problem of homogenization (e.g. Capdeville et al., 2010a; Capdeville et al.,
2010b), discussed in Chapter 11.


4.8.3 Optimizing operators 


Many attempts have been made to develop better, optimal finite-difference
operators specifically for the elastic wave-propagation problem, that go beyond the
more general improvements of the time-extrapolation schemes described above.
A very clever approach is to artificially make errors in the space derivatives that
compensate for the errors committed by the time extrapolation, to obtain a truly
high-order scheme.¹¹ This approach was taken, for example, by Emmerich and
Korn (1987).

Fig. 4.24 Internal interfaces. Interfaces that do not coincide with grid lines are
not properly accounted for by regular spaced grids. There are several strategies
to improve the situation, by assigning equivalent medium properties to the grid
points adjacent to the interfaces (see text for details).

Fig. 4.25 Optimal operators. The conventional second-order finite-difference
operators for the second derivative are compared with the optimal operators
developed by Geller and Takeuchi (1998) (see text for details).

Second time derivative (weights in units of 1/dt²)

         Conventional              Optimal
         x-dx   x     x+dx         x-dx     x        x+dx
t+dt            1                  1/12     10/12    1/12
t              -2                  -2/12    -20/12   -2/12
t-dt            1                  1/12     10/12    1/12

Second space derivative (weights in units of 1/dx²)

         Conventional              Optimal
         x-dx   x     x+dx         x-dx     x        x+dx
t+dt                               1/12     -2/12    1/12
t        1      -2    1            10/12    -20/12   10/12
t-dt                               1/12     -2/12    1/12

Fig. 4.26 Simulation with optimal operators. Synthetic seismograms for the
1D acoustic case are compared. In this example the energy misfit with the
analytical solution for the classic five-point operator is 10.1% and for the
optimal operator 1.2%.

11 In many publications it is stated that nth order schemes are used. Often such
statements refer to the accuracy in space only. This rarely translates to the overall
accuracy of the space-time-dependent solutions. Only careful convergence analysis
allows quantification of the true convergence order.

Geller and Takeuchi (1995) developed criteria against which the accuracy of 
frequency-domain calculation of synthetic seismograms could be optimized. This 
approach was transferred to the time-domain finite-difference method for homo- 
geneous and heterogeneous schemes by Geller and Takeuchi (1998). Let us have 
a look at an optimal operator and compare with the classic scheme. The space— 
time stencils are illustrated in Fig. 4.25. Note that by summing up the optimal 
operators one obtains the conventional operators. This can be interpreted as a 
smearing out of the conventional operators in space and time. The optimal
operators lead to a locally implicit scheme, as the future of the system at
$(x, t + dt)$ depends on values at time level $t + dt$, that is, the future depends
on the future. That sounds impossible, but it can be fixed by using a
predictor-corrector scheme based on the first-order Born approximation.

The optimal operators perform in a quite spectacular way. With very few extra 
floating point operations an accuracy improvement of almost an order of magni- 
tude can be obtained. The results shown in Fig. 4.26 were obtained by coding 
the algorithm presented by Geller and Takeuchi (1998). The optimal scheme 
performs substantially better than the conventional scheme with a five-point oper- 
ator. The details about the simulation set-up are not important here. The message 
is simple: While finite-difference methods often have a brute-force reputation it 
is important to note that a well-written finite-difference algorithm is highly com- 
petitive when compared to other high-order schemes such as the spectral element 
method. In the light of this it is surprising that the optimal operator concept does 
not seem to be widely used. 
4.8.4 Minimal, triangular, unstructured grids


Can the finite-difference method be applied to triangular, or unstructured grids? 
This question dominated my own research when working towards solutions for 
global wave propagation. Magnier et al. (1994) presented a very elegant algo- 
rithm for equilateral triangular grids (hexagonal structures). By interleaving two 
such grids a staggered-grid-type scheme was developed allowing the efficient so- 
lution of the elastic wave equation in 2D. Such grids have the neat property that 
the error is isotropic due to the hexagonal symmetry of the spatial grid. Unfortu- 
nately no such isotropic grid exists in 3D! Later it turned out that the difference 
weights derived for the minimal grids are essentially first-order finite-volume 
weights. 

But what happens when the triangles are not equilateral? Can we increase the 
influence domain and use more points for the derivative calculation like in the 
classic finite-difference method? These questions led to the search for general 
ways to calculate differential weights for unstructured grids. At around that time, 
Braun and Sambridge (1995) had imported the concepts of natural neighbour 
coordinates from computational geometry to geophysics. The fact that this paper 
appeared in Nature is a sign of how important this problem was considered by the 
scientific community. 

The differential operators proposed by Braun and Sambridge (1995) were
investigated and applied to the 2D elastic wave-propagation problem by Käser
et al. (2001) and Käser and Igel (2001), with applications to spherical
structures, and media with curved internal boundaries or complicated free
surfaces. An illustration of the natural neighbour concepts with Voronoi cells
is shown in Fig. 4.27. The results with this approach indicated that the
operators are not accurate enough for the elastic wave-propagation problem and
not easily extended to high order. Eventually this frustration led to the
adaptation of the more sophisticated discontinuous Galerkin method to the
wave-propagation problem (Käser and Dumbser, 2006). It is fair to say that, for
problems involving unstructured tetrahedral grids, today this is the method of
choice.
Fig. 4.27 Finite differences on unstructured grids. Unstructured grids can be
treated using Delaunay triangulation and Voronoi cells. Left: Akin to
staggered-grid schemes, unstructured primary grids (e.g. stresses) can be
interleaved with a secondary grid (e.g. velocities), allowing the calculation
of (first) derivatives for a velocity-stress wave-propagation scheme. The
differential weights can be calculated using natural neighbour coordinates or
finite-volume concepts. Right: Voronoi cells for a spherical shape with grid
densification near the boundary.
Fig. 4.28 Schematic illustration of multidomain axisymmetric mesh for
staggered-grid finite-difference calculations with grid refinement towards the
surface. Note that, in addition to the grid densification due to the spherical
grid, velocities increase with depth in the mantle. This leads to small time
steps due to the CFL criterion. At the domain boundaries additional
interpolations are necessary to calculate the space derivatives. From Igel and
Gudmundsson (1997). Reprinted with permission.


Fig. 4.29 SH waves in the Earth’s mantle. SH-wave propagation based on a
staggered-grid finite-difference scheme for the spherically symmetric Earth
model PREM. The source is at 600 km depth and the dominant period is 25 s.
Note the reflections from the core-mantle boundary, and the increasingly
complex wavefield near the Earth’s surface from multiply reflected S-phases
(building up Love surface waves).


4.8.5 Other coordinate systems 


How about other coordinate systems? In fact, some of the first papers applying the 
finite-difference method to wave-propagation problems were using non-Cartesian 
formulations of the wave equation. Alterman and Karal (1968) used cylindrical 
coordinates, and the same group developed a scheme for spherical coordinates 
(Alterman et al., 1970) targeting global wave propagation. This was heroically 
forward looking, given the tiny computational resources that existed at the time. 

Because working in other coordinate systems is still very interesting today, let
us have a quick look at the acoustic wave equation in spherical coordinates (the
same concepts apply to cylindrical coordinates). Assuming a standard spherical
coordinate system r, θ, φ and invariance in φ (so-called zonal or axisymmetric
model) we can write the wave equation using the Laplace operator as

$$\partial_t^2 p = c^2 \left[ \frac{1}{r^2}\,\partial_r\!\left(r^2\,\partial_r p\right) + \frac{1}{r^2 \sin\theta}\,\partial_\theta\!\left(\sin\theta\,\partial_\theta p\right) \right] + s, \tag{4.81}$$

where p(r,θ) is pressure, c(r,θ) is acoustic velocity, and s(r,θ) is a source term.

This looks much more complicated than the Cartesian case, and for the
finite-difference-based numerical solution there are dramatic consequences.
First, the equation contains singularities and is not defined along the axis
θ = 0. Second, regular discretization along the coordinate axes r and θ leads
to increasing grid point distance for increasing radius. This is illustrated in
Fig. 4.28. For global wave propagation this is the opposite of what we need. In
the mantle, seismic velocities increase with depth; therefore we would prefer
to increase grid spacing with depth.
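
The geometric effect described above is easy to quantify. The sketch below is an illustration only: the number of grid points in θ is a hypothetical choice, and the radii are rounded standard values for the Earth's surface and the core-mantle boundary. With a constant angular increment, the lateral grid distance r dθ grows linearly with radius.

```python
import numpy as np

R_SURFACE = 6371.0  # km, mean Earth radius
R_CMB = 3480.0      # km, approximate radius of the core-mantle boundary

n_theta = 1801                       # hypothetical number of points for 0..180 degrees
dtheta = np.pi / (n_theta - 1)       # constant angular increment in radians
r = np.linspace(R_CMB, R_SURFACE, 5) # a few radii between CMB and surface

arc_spacing = r * dtheta             # lateral grid point distance r*dtheta in km

for radius, h in zip(r, arc_spacing):
    print(f"r = {radius:7.1f} km   lateral spacing = {h:6.2f} km")

print(f"surface/CMB spacing ratio: {arc_spacing[-1] / arc_spacing[0]:.2f}")
```

The grid is therefore finest at depth, where velocities are highest, which is what makes the CFL-limited time step so small for a regular spherical grid.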

Nevertheless, it is possible to devise staggered-grid finite-difference schemes
for this case (Igel and Weber, 1995, 1996; Chaljub and Tarantola, 1997) that
led to some of the first numerical investigations of scattering effects in
laterally heterogeneous mantle structure (e.g. Toyokuni and Takenaka, 2006;
Jahnke et al., 2008; Thorne et al., 2013b). An example of some of the first
simulations is shown in Fig. 4.29 for global SH-wave calculations. At that time
the visualization of such wavefields, despite their limited scientific value,
allowed an unprecedented view of what happens when waves pass through the
Earth’s interior. Even today I find these visualizations fascinating. They have
enormous educational value; a fact that was recently recognized by Thorne
et al. (2013a), who provided numerous openly accessible animations.¹²

In principle 3D wave propagation in spherical coordinates (spherical sections 
excluding poles) using finite differences is possible. Igel et al. (2002) presented 
an anisotropic scheme and used the method to investigate wave effects of sources 
located inside subduction zones. An illustration of this approach is shown in 
Fig. 4.30. 


4.8.6 Concluding remarks 


In its simplest form the finite-difference method offers a straightforward intro- 
duction to the world of numerical methods. For many physics problems a first 
approximate solution can be obtained relatively quickly, and it can help us to 
develop a deeper understanding of the necessary properties of any solution. 
Therefore it plays an important role for the development of simulation methods 
in many fields of science. 

The finite-difference method continues to play an important role in Earth sci- 
ence today. As indicated, with relatively minor extensions to the basic solution 
schemes, finite-difference simulations can be very competitive with other meth- 
ods, in particular for problems concerning strong ground motion, local scale wave 
propagation, and seismic exploration problems (see Chapter 10 on applications 
for specific examples). 


Chapter summary 


• Replacing the partial derivatives by finite differences allows partial
differential equations such as the wave equation to be solved directly for (in
principle) arbitrarily heterogeneous media.




Fig. 4.30 Wave propagation in spheri- 
cal sections using finite differences. The 
singularities in the wave equation for- 
mulated in spherical coordinates can 
be avoided by appropriate positioning 
in the spherical domain (e.g. around 
the equator). Snapshot of elastic wave 
propagation for an explosive source at 
600 km depth. Note the P-S conversions 
at the core-mantle boundary. From Igel 
et al. (2002). Reprinted by permission. 


¹² http://web.utah.edu/thorne/animations.html
• The resulting space-time discretization leads to unphysical phenomena
such as numerical dispersion that can only be avoided by sampling with
enough grid points per wavelength.

• The accuracy of finite-difference operators can be improved by increasing
the number of grid points (i.e. longer operators) used to approximate the
derivatives. The weights for the grid points can be obtained using Taylor
series (or spectral methods).

• Plane-wave (or von Neumann) analysis of the approximate scheme
leads to a stability criterion that restricts the choice of the space-time
discretization.

• In 2D and 3D the error of wave propagation becomes anisotropic. In
regularly spaced grids the most accurate direction is at 45° to the grid axes.

• The finite-difference method, despite usually low-order implementations,
remains an attractive numerical scheme for many applications in seismology,
even for problems that require accurate surface waves, provided that the free
surface gets special treatment.

• In principle finite-difference-type operators are possible on unstructured
grids but only with low-order accuracy.

• Finite-difference approximations to the wave equation in cylindrical or
spherical coordinates are possible, with restrictions due to the intrinsic
singularities.
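
The Taylor-series route to finite-difference weights mentioned in the summary can be sketched as a small linear system. This is a minimal illustration, not the book's supplementary code: the function name `fd_weights` and its arguments are hypothetical, and the grid offsets are given in units of dx.

```python
import math
import numpy as np

def fd_weights(offsets, deriv_order):
    """Weights w_j with sum_j w_j f(x + s_j dx) ~ dx**deriv_order * f^(deriv_order)(x).

    offsets are the grid positions s_j in units of dx (e.g. [-1, 0, 1]).
    """
    s = np.asarray(offsets, dtype=float)
    n = s.size
    # Row m enforces the Taylor condition sum_j w_j s_j**m / m! = delta(m, deriv_order).
    A = np.array([s**m / math.factorial(m) for m in range(n)])
    b = np.zeros(n)
    b[deriv_order] = 1.0
    return np.linalg.solve(A, b)

w3 = fd_weights([-1, 0, 1], 2)          # three-point second derivative
w5 = fd_weights([-2, -1, 0, 1, 2], 2)   # five-point second derivative

print("three-point weights (times 1/dx^2):", w3)
print("five-point weights  (times 1/dx^2):", w5)
```

Longer operators are obtained simply by passing more offsets; staggered operators follow from half-integer offsets such as [-1.5, -0.5, 0.5, 1.5] with `deriv_order=1`.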


FURTHER READING 


For the finite-difference method, some good references are:

• The recent book by Moczo et al. (2014) is the most complete work on
the finite-difference method applied to elastic wave propagation to date. It
provides many different algorithms and discusses pros and cons of various
implementation strategies. There are also detailed sections on different
rheologies (viscoelasticity) and boundary conditions (free surface, internal
material interfaces, absorbing boundaries).

• The book by Fichtner (2010) on modelling and inversion of seismic
waves has a section on finite differences and spectral elements with some
additional features such as spherical coordinates.

• The Society of Exploration Geophysicists (SEG) has published two special
volumes (Kelly and Marfurt, 1990; Robertsson et al., 2012) with many classic
papers on the various numerical methods applied to wave propagation.

• A more mathematical treatment of finite-difference (and other) methods
applied to wave-propagation problems can be found in Durran (1999).


EXERCISES 


Comprehension questions 


(4.1) Characterize problems that necessitate the use of numerical methods
such as finite differences.

(4.2) Are finite-difference-based approximations of partial-differential
equations unique (give arguments)?

(4.3) What strategies are there to improve the accuracy of finite-difference
derivatives? Give the procedures in words.

(4.4) What is stability in connection with finite-difference algorithms? Give
the relevant condition for the 1D wave-propagation problem.

(4.5) What is convergence?

(4.6) What is the difference between physical and numerical dispersion?

(4.7) Which propagation direction is most accurate on a rectangular (square)
grid? Can you suggest any reasons why this might be so?

(4.8) Give strategies to check whether a finite-difference simulation is
accurate for (a) a homogeneous medium, and (b) a strongly heterogeneous
medium.

(4.9) Explain why staggered grids appear to be useful for the elastic wave
equation.

(4.10) What is the difference between phase and group velocity? To be on the
safe side, which velocity should be accurately modelled, and why?

(4.11) Are finite-difference methods easily parallelized using domain
decomposition? Do processors need to communicate with each other? Make an
illustration for a 2D problem.

(4.12) Explain why for Earth models with large variations in seismic velocities,
varying the grid cell size is highly desirable. What is the problem with
having to have a global time step dt, though (i.e. one dt for all grid cells)?


Theoretical problems 


(4.13) Show that

$$\frac{f(x + dx) - 2f(x) + f(x - dx)}{dx^2}$$

is an approximation for the second derivative of f(x) with respect to x at
position x. Hint: Use Taylor series

$$f(x \pm dx) = f(x) \pm dx\, f^{(1)}(x) + \frac{dx^2}{2!}\, f^{(2)}(x) \pm \frac{dx^3}{3!}\, f^{(3)}(x) + \dots$$

where f^(n)(x) is the nth derivative of f(x). What is the leading order of the
error term?

(4.14) Derive the numerical dispersion equation (Eq. 4.44) for the 1D acoustic
wave equation using the von Neumann analysis.

(4.15) Use Taylor’s theorem to approximate the derivative of f(x) with the
functional values given by f(x + dx/2) and f(x - dx/2). What is the order of
accuracy? You are not happy with this accuracy and would like to have a
higher-order approximation. Calculate the derivative weights, if you also
use information at points x + 3/2 dx and x - 3/2 dx.

(4.16) Generalize the procedure of the previous exercise and derive equations
for the system matrix A (Eq. 4.16) for centred and staggered-grid
finite-difference operators of arbitrary length.

(4.17) You want to estimate the derivative of a function f(x) near a boundary
using the high-order finite-difference method. Develop the required system
matrix and calculate the one-sided derivative weights for operators of
arbitrary length. Hint: Define the derivative at f(x + dx/2) and search
for weights at f(x) and f(x + n dx). Discuss the results.

(4.18) The source-free advection equation is given by

$$\partial_t u(x,t) = v\, \partial_x u(x,t),$$

where u(x, t = 0) could be a displacement waveform at t = 0 (an initial
condition) that is advected with velocity v (this will become important
in Chapters 8 and 9 on finite volumes and the discontinuous Galerkin
method, respectively). Replace the partial derivatives by finite differences.
Which approach do you expect to work best? Turn it into a programming
exercise and write a simple finite-difference code and play around with
different schemes (centred vs. non-centred finite differences). What do
you observe?

(4.19) A seismometer consists of a spring with damping parameter ε, and
eigenfrequency ω₀. The seismometer is excited by the (given) ground motion
u(t). The relative motion of the seismometer mass x(t) is governed by the
following equation

$$\ddot{x} + 2\epsilon\,\dot{x} + \omega_0^2\, x = \ddot{u}.$$

Replace the derivatives on the left-hand side with finite differences. Solve
for x(t + dt). Note: a good strategy in this example is to centre the
differences at the same point in time. The dots denote time derivatives.

(4.20) Certain isotopes (e.g. ⁹Be) are washed into the sea by rivers and then
mixed by advection through ocean currents and diffusion. In addition,
the isotopes are removed from the system through biomechanical processes
(e.g. death). These processes can be described by the
diffusion-advection-reaction equation (concentration C(x, t), diffusivity k
(const), reactivity R(x), source p(x), advection velocity v). Substitute in the
1D equation below the partial differentials with finite differences and
extrapolate to C(t + dt):

$$\partial_t C = k\, \partial_x^2 C + v\, \partial_x C - R\, C + p.$$

How could a ring current be simulated with this 1D equation mimicking an
oceanic gyre? What do you think is the best choice for the
finite-difference formulation and why?

(4.21) You want to simulate 2D acoustic wave propagation in a medium with
size 1,000 km × 1,000 km. You want to model wave propagation up to
a period of 10 s. The maximum velocity c is 8 km/s, the minimum velocity
is 4 km/s. Your numerical algorithm requires 20 grid points per
wavelength to be accurate for the propagation distances of interest. What
space increment dx do you need for the simulation? The stability criterion
says that maximum velocity c, space increment dx and time increment dt
are related by ε = c dt/dx. You want a seismogram length of 500 s. How
many time steps do you have to simulate, when ε = 0.5?

(4.22) Show that when setting the Courant criterion to ε = 1 for the
homogeneous acoustic problem with constant dt and dx (in other words the
physical velocity c = dx/dt) there is no numerical dispersion. Hint: Make
use of Eq. 4.45. What is the relevance for practical applications?

(4.23) Choose an appropriately tight formulation for the discretized fields (see
examples in the text) and write down the finite-difference extrapolation
scheme for the 3D acoustic wave equation (Δ is the Laplace operator)

$$\partial_t^2 p(x,y,z,t) = c(x,y,z)^2\, \Delta p(x,y,z,t) + s(x,y,z,t).$$

(4.24) Show that the Nyquist wavenumber corresponds to 2 grid increments per
wavelength.

(4.25) Following the von Neumann analysis based on plane waves in the text
calculate the stability limit (i.e. CFL criterion) for the 3D acoustic wave
equation (see previous exercise).

(4.26) Following the developments in the section on staggered grids, write down
the second-order 3D elastic wave equation in the displacement formulation,
as well as the stress-strain relation, and the strain-displacement
relation. Find an appropriate 3D staggered finite-difference cell where
the derivatives are calculated in between the functional values (e.g. strain
components as the derivatives of displacement components).

(4.27) You want to simulate global wave propagation. The highest frequency
that we observe for global wave fields is 1 Hz. Let us for simplicity assume
a homogeneous Earth. The P velocity v_p = 10 km/s and the v_p/v_s ratio is
√3. Let us assume 20 grid points per wavelength. How many grid cells
would you need (assume cubic cells)? What would their size be? Now
let us be more realistic. The maximum P-velocity in the Earth is 14 km/s
and the smallest P-velocity is 1.5 km/s in the oceans, or 5 km/s in the
crust. Assume that you can only have one grid size for the whole Earth.
Estimate the number of cells, their size, and the required time step. Assume
ε = 0.5 for the CFL criterion.

(4.28) The strain-displacement relation is given by

$$\epsilon_{ij} = \frac{1}{2}\left(\partial_i u_j + \partial_j u_i\right).$$

Write down this relation in 2D. Allocate strain and displacement components
to the four symbols such that there is a consistent scheme for a
finite-difference method with a two-point operator for the first derivative.
The central square corresponds to elements ij (e.g. x → i, y → j). Is the
mapping unique?

[Figure: grid cell with four different symbols marking the candidate locations
of the strain and displacement components.]


Programming exercises 


For the following exercises, please also refer to the supplementary electronic
material.

(4.29) Write a computer program for the 1D (2D) acoustic wave equation
following the equations presented in this chapter. Implement the analytical
solution (see Chapter 2) and try to match it with appropriate parameters.

(4.30) Determine numerically the stability limit of 1D and 2D implementations
of the acoustic wave equation as accurately as possible by varying the
stability criterion.

(4.31) Increase the dominant frequency of the wavefield in 2D. Investigate
the behaviour of the wavefield as a function of azimuth. Why does the
wavefield look anisotropic? Which direction is the most accurate and
why?

(4.32) Extend the (1D and/or 2D) codes by adding the option to use a
five-point operator for the second derivative. Compare simulations with the
three-point and five-point operators. Is the stability limit still the same?
Make it an option to change between three-point and five-point operators.
Estimate the number of points per wavelength you are using and
investigate the accuracy of the simulation by looking for signs of numerical
dispersion in the resulting seismograms. The five-point weights are:
[-1/12, 4/3, -5/2, 4/3, -1/12]/dx².

(4.33) Modify the program such that at the end of the calculation you can
visualize and output synthetic seismograms.

(4.34) Modify the 2D velocity model c in ac2d and observe and discuss the
resulting wavefield. (1) Add a low- (high-) velocity layer near the surface.
Inject the source in this layer. (2) Add a vertical low-velocity zone
(fault zone) of a certain width (e.g. 10 grid points), and discuss the
resulting wavefield (fault-zone trapped waves). (3) Simulate topography by
setting the pressure to 0 above the surface. Use a Gaussian hill shape or a
random topography.

(4.35) Use a spike source time function and look at the resulting seismogram.
Examine the spectrum of this Green’s function. Do you spot the numerical
noise? Convolve the resulting seismograms with an appropriate source
time function (e.g. a Gaussian of appropriate length). What happens with
the numerical noise?

(4.36) Source-receiver reciprocity. Initialize a strongly heterogeneous 2D
velocity model of your choice and simulate waves propagating from an internal
source point (x_s, z_s) to an internal receiver (x_r, z_r). Show that by
reversing source and receiver you obtain the same seismograms.

(4.37) Time reversal. Define a source at the centre of the domain in an
arbitrary 2D velocity model and a receiver circle at an appropriate distance
around the source. Simulate a wavefield, record it at the receiver ring, and
store the results. Reverse the synthetic seismograms and inject them as sources
at the receiver points. What happens? Can you explain the results?
¹ Why pseudo? Well, pseudo means something like "sort of but not really", and
in the context of spectral and pseudospectral it has a specific significance.
With spectral methods the equations are expressed and solved in the spectral
domain, but with the pseudospectral method the spectral domain is used merely
for the calculation of spatial derivatives; the equations are space-time
formulations.


The Pseudospectral Method 


In terms of chronological order the pseudospectral¹ method was the first method
that followed finite differences and was used extensively in several seismological 
research problems. We have seen in the previous chapter that the approximate 
spatial derivatives lead to quite dramatic problems when waves propagate over 
long distances. In the light of this, the desire was to find more accurate operators 
for the space derivatives. We have seen that extending the finite-difference oper- 
ators leads to more accurate derivatives. In a sense, the pseudospectral method 
can be considered the most extreme case in which the length of the derivative 
operator is equivalent to the number of points along one of the space dimensions. 

The attractive property of pseudospectral methods is that the space deriva- 
tives can be calculated exactly; that is, to at least machine precision. Of course this 
accuracy comes at a price. The price is that per calculation of space derivative 
many more floating-point operations have to be carried out. The two approaches 
we will introduce in what follows—the Fourier and the Chebyshev methods—can 
make use of the fast Fourier transform (FFT) to calculate derivatives. However, 
it is important to note that the pseudospectral method requires global communi- 
cation; in other words, the future of a certain point in the grid depends on the 
current state of all other points. 

The pseudospectral method was developed at a time when hardware archi- 
tecture still primarily favoured serial processing. Global communication schemes 
are suboptimal for massively parallel computer architectures that favour mini- 
mal and preferably near-neighbour communications. This is the main reason 
why the pseudospectral method in its simplest form disappeared from the sim- 
ulation market, when parallelization became a standard (some hybrid approaches 
are still used, and will be discussed here). However, it is important to note that 
because of the high accuracy of the space differentiation—requiring fewer grid 
points per wavelength compared to other methods—the pseudospectral method 
is very memory efficient. 

On the other hand, of all the (regular grid-type) methods presented in this 
volume, the pseudospectral method is my favourite simply for its mathematical 
elegance and the fact that it does not require grid staggering. This latter prop- 
erty made it attractive in particular for the simulation of wave propagation in 
3D anisotropic media or spherical sections (see Fig. 5.1), for which staggered 
finite-difference schemes suffer from additional errors. 

In terms of the evolution of computational seismology, the mathematical
concepts introduced in connection with the pseudospectral methods (series
expansions, cardinal functions, exact interpolation, etc.) are elementary
ingredients of methods that are state of the art today, such as the
spectral-element method or the discontinuous Galerkin method. After a brief
introduction to the history of the pseudospectral method, we will present the
fundamentals of function interpolation using Fourier series and Chebyshev
polynomials and then apply these concepts to the numerical solution of the
acoustic and elastic wave equations.


5.1 History 


Pseudospectral methods entered the arena in the early eighties as transform
methods because their implementation was based on the Fourier transform
(Gazdag, 1981; Kosloff and Baysal, 1982). Later the term Fourier method was
also used. Initial applications to the acoustic wave equation were extended to
the elastic case (Kosloff et al., 1984), and to 3D (Reshef et al., 1988).
Efficient time-integration schemes were developed (Tal-Ezer et al., 1987) that
allowed large time steps to be used in the extrapolation procedure.

The biggest attraction of the pseudospectral method based on Fourier 
transforms was the fact that, compared to finite-difference schemes, substantially 
less memory was required, in particular for 3D calculations. This was possible 
because a smaller number of grid points was required due to the high accuracy 
of the derivative calculations.² The drawback of using Fourier transforms is the
implicit assumption of periodicity along the spatial dimensions. This implies that 
boundary conditions such as the free surface condition are difficult to implement 
efficiently. 

An elegant fix to this problem was achieved by replacing harmonic functions as 
bases for the function interpolation with Chebyshev polynomials (Kosloff et al., 
1990). The originally infinite area calculations (because of periodicity) were con- 
verted to limited area calculations as the Chebyshev polynomials are defined in 
the interval [-1,1] (easily scaled to arbitrary length). This formulation allowed 
efficient implementation of free surface or absorbing boundaries by means of 
characteristic variables (Carcione and Wang, 1993). 

However, as we all know, there is no free lunch, and a disadvantage also ac- 
companies the use of Chebyshev polynomials. The collocation points at which the 
functions are exactly interpolated are irregular and densify towards the bound- 
aries. The difference between shortest and largest grid point distance increases 
with the overall number of points along one dimension. In principle this can be 
compensated for by re-stretching the grid towards more regular grid distances 
(an approach presented in Carcione and Wang, 1993). 
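
The densification of the collocation points can be made concrete with a short calculation. The sketch below assumes the Chebyshev-Gauss-Lobatto points x_i = cos(πi/N), i = 0, ..., N, on [-1, 1], which is a common choice but is an assumption here; the function names are illustrative only.

```python
import numpy as np

def chebyshev_points(n):
    """Chebyshev-Gauss-Lobatto collocation points x_i = cos(pi*i/n), i = 0..n."""
    return np.cos(np.pi * np.arange(n + 1) / n)

def spacing_ratio(n):
    """Ratio of largest to smallest grid point distance for n+1 points."""
    spacing = np.abs(np.diff(chebyshev_points(n)))
    return spacing.max() / spacing.min()

for n in (16, 32, 64, 128):
    spacing = np.abs(np.diff(chebyshev_points(n)))
    print(f"N = {n:4d}   min dx = {spacing.min():.6f}   "
          f"max dx = {spacing.max():.6f}   ratio = {spacing_ratio(n):7.1f}")
```

For these points the smallest distance shrinks roughly like 1/N² while the largest shrinks like 1/N, so the ratio grows about linearly with N. Since the stability criterion is controlled by the smallest grid distance, this is what motivates the re-stretching of the grid.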

To improve the accurate modelling of curved internal interfaces and surface
topography, grid stretching (see Fig. 5.2) as a coordinate transform was
introduced and applied to acoustic and elastic wave-propagation problems
(Tessmer et al., 1992; Komatitsch et al., 1996). A further advantage of the
pseudospectral
Fig. 5.1 Computational grid for a pseudospectral simulation in spherical
coordinates based on Chebyshev polynomials. Note the decreasing grid point
distance towards the boundaries of the physical domain. In combination with
the Chebyshev formulation this allows elegant implementation of free-surface or
absorbing boundaries. From Igel (1999).
Fig. 5.2 Illustration of grid stretching for a physical domain with curved
boundaries. The grid lines are analytically generated such that they follow the
desired boundary. The Jacobian maps the original Cartesian to the curvilinear
grid (e.g. Carcione and Wang, 1993).


² The reduction of the number of grid points per wavelength by a factor of 2
leads to a memory reduction by a factor of 8 in 3D. In addition, because of the
stability criterion, a time step twice as large reduces the overall computation
time.
Fig. 5.3 Principle of the pseudospectral method illustrated with a simulation
of acoustic waves in 1D through a medium with random velocity perturbations.
Bottom: Snapshot in space of the pressure field p. Top: Amplitude spectrum of
the wavefield shown below. From the sampling theorem the wavenumber axis is
limited by the Nyquist wavenumber k_N = π/dx. The first derivative is
calculated by multiplying the spectrum by ik followed by an inverse Fourier
transform resulting in an exact (to machine precision) derivative ∂_x p(x) at
the regular grid points.


approach is the fact that all fields (displacement, elastic parameters) are defined 
at the same grid locations (which is not the case in staggered-grid finite-difference 
schemes). This implies that in particular anisotropic media (Tessmer, 1995) can 
be efficiently implemented without the need for further interpolations. The same 
holds for the application of the pseudospectral method to the wave equation in 
other coordinate systems such as spherical coordinates (e.g. Igel, 1999). 

Despite the difficulties of using the pseudospectral method on parallel
hardware, the method has been used for interesting seismological problems
(Furumura et al., 1998a; Furumura et al., 1998b; Furumura et al., 1999;
Furumura and Kennett, 2005) partly by mixing finite-difference operators and
pseudospectral operators in the different spatial directions (Furumura et al.,
2002).


5.2 The pseudospectral method in a nutshell 


In the introductory chapter we categorized the numerical methods presented in
this volume roughly into grid point methods and series expansion methods. The
pseudospectral method is both! On the one hand, the spatial displacement field
is expanded in (Fourier or Chebyshev) series; on the other hand, because of the
specific interpolation properties of the basis functions, the actual values at
the corresponding grid points are (exactly) the coefficients of the basis
functions. This will become clear when we discuss the details. As stated before,
the time derivatives in the wave equation will be replaced by finite
differences, leaving us with

$$\frac{p(x, t + dt) - 2p(x, t) + p(x, t - dt)}{dt^2} = c(x)^2\, \partial_x^2 p(x, t) + s(x, t) \tag{5.1}$$

for the acoustic wave equation. The remaining task is to calculate the space
derivative on the right-hand side. In general, the nth derivative of a
space-dependent function can be expressed as

$$\partial_x^{(n)} p(x, t) = \mathcal{F}^{-1}\left[(ik)^n P(k, t)\right] \tag{5.2}$$

where i is the imaginary unit, F⁻¹ is the inverse Fourier transform, and P(k, t)
is the spatial Fourier transform of the pressure field p(x, t), k being the
wavenumber. When using the discrete Fourier transform of functions defined on a
regular grid (which is the case in our applications), we obtain exact (machine
precision) derivatives up to the Nyquist wavenumber k_N = π/dx (two points per
wavelength). The price to pay is the forward and inverse Fourier transform,
which, depending on the number of grid points along the physical dimension,
requires substantially more floating point operations than the
finite-difference approach.

The principle of the pseudospectral method based on the Fourier series is 
illustrated in Fig. 5.3. The use of sine and cosine functions for the expansions 
implies periodicity at the boundaries of the physical domain. 

This is not the case in most geophysical applications. In addition, the
common boundary conditions (free surface, absorbing) are basically impossible
to implement with similar (almost perfect) accuracy compared with the
derivatives inside the medium. To some extent this can be improved by using other basis
functions with similar interpolation behaviour such as Chebyshev polynomials. 
As they are defined in the interval [-1, 1] they are easily adapted to limited-area 
calculations, and an efficient implementation of boundary conditions is possible. 

The pseudospectral method uses a mathematical principle (exact interpola- 
tion at grid points) that was later used extensively in the spectral-element method 
(in combination with the corresponding numerical integration scheme); therefore
it deserves a prominent place in the history of computational seismology. In
the end—despite its accuracy, the high memory efficiency, and its elegance—it 
did not replace the finite-difference method for geo-scientific applications. The 
reason is the communication-intensive algorithm that is difficult to implement 
efficiently on many-core systems. 


5.3 Ingredients 


When it was introduced to seismic wave propagation, the pseudospectral method 
was considered more complicated than finite differences (Kosloff and Baysal,
1982), even though, by comparison with spectral elements or the discontinuous 
Galerkin method, the actual implementation using the often intrinsic fast Fourier 
transforms (FFT) is short and elegant. What is important to note is that the math- 
ematical approach is indeed very different, and digs deeper into the world of 
numerical mathematics. Readers familiar with the concepts of function interpola- 
tion may jump directly to the actual solution of the wave equation in Section 5.4. 
However, the concepts of exact interpolation on specific spatial grids, cardinal 
functions, etc. play such an important role in the other methods, that we continue 
with a brief introduction. 


5.3.1 Orthogonal functions, interpolation, 
derivative 


In many situations, not only in natural sciences, we either (1) seek to approximate 
a known analytic function by an approximation, or (2) know a function only at a 
discrete set of points and would like to interpolate in between those points so that 
we have a representation everywhere. Let us start with the first problem and pose 
it such that we seek to approximate our known function by a finite sum over some 
N basis functions \Phi_i:


f(x) \approx g_N(x) = \sum_{i=0}^{N} a_i \Phi_i(x)   (5.3)


and assume that the basis functions form an orthogonal set.³
³ What are orthogonal functions? Functions f(x) and g(x) are orthogonal in the
interval [a, b] if \int_a^b f(x) g(x) dx = 0. The concept of orthogonality is
more easily grasped with vectors. Thinking of discrete representations of f(x)
and g(x) as vectors f_i and g_i of length n, the integral can be interpreted as
an n-dimensional scalar vector product with the number of elements going to
infinity.

Fig. 5.4 Examples of non-differentiable functions (in the classic sense).
a: Heaviside function; b: ramp; and their derivatives c: spike; d: boxcar
function. All the functions illustrated play a role in seismology as possible
source time functions for earthquakes or explosions.

Why would one want to replace a known function with something else? In
many fields of science dynamic phenomena are expressed by partial differential
equations. This implies that in order to find solutions we must be able to deal
with derivatives. Yet, in many cases either nature does not do us the favour of
being smooth and differentiable (e.g. layered Earth models with sudden changes
of physical properties, sudden bursts of energy, such as explosions) or there are
mathematical reasons to deal with functions that are non-differentiable (e.g.
turn-on-turn-off phenomena in circuits, saw-tooth-like behaviour (see Fig. 5.4)).
Once we find approximations to the original function with sufficient accuracy 
(the criteria will be defined below), we are in good shape. With the right choice 
of differentiable basis functions \Phi_i the calculation of the (approximate) derivative
becomes trivial as

\partial_x f(x) \approx \partial_x g_N(x) = \sum_{i=0}^{N} a_i \partial_x \Phi_i(x).   (5.4)

There are many possible choices for basis functions and we will encounter a va- 
riety of them (e.g. Chebyshev polynomials, Lagrange polynomials, radial basis 
functions) in the course of this volume. Let us start with the most commonly 
used trigonometric basis functions (at least in spectral analysis). Consider the set 
of basis functions 


\cos(nx),   n = 0, 1, \ldots, \infty
\sin(nx),   n = 0, 1, \ldots, \infty   (5.5)

with

1, \cos(x), \cos(2x), \cos(3x), \ldots
0, \sin(x), \sin(2x), \sin(3x), \ldots   (5.6)


in the interval [-π, π]. We can proceed by checking whether these functions are
orthogonal by evaluating integrals with all possible combinations 


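\int_{-\pi}^{\pi} \cos(jx) \cos(kx) dx = \begin{cases} 0 & for j \neq k \\ 2\pi & for j = k = 0 \\ \pi & for j = k > 0 \end{cases}

\int_{-\pi}^{\pi} \sin(jx) \sin(kx) dx = \begin{cases} 0 & for j \neq k, \; j, k > 0 \\ \pi & for j = k > 0 \end{cases}   (5.7)

\int_{-\pi}^{\pi} \cos(jx) \sin(kx) dx = 0   for j \geq 0, k > 0
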
to find that these trigonometric functions form indeed an orthogonal set (see 
Fig. 5.5 for an illustration). Our problem can consequently be stated as finding 
an approximate function g_N(x) so that


f(x) \approx g_N(x) = \sum_{k=0}^{N} a_k \cos(kx) + b_k \sin(kx).   (5.8)


How can we find the coefficients a_k, b_k given function f(x)? An obvious way
of posing this problem mathematically is to seek coefficients that minimize the
difference between approximation g_N(x) and the original function f(x). However,
there are many ways of defining the difference (i.e. distance, norm) between two
functions. The preferred choice is the so-called l_2-norm⁴ basically quantifying the
misfit-energy between the two functions:


\| f(x) - g_N(x) \|_{l_2} = \left[ \int_a^b \{ f(x) - g_N(x) \}^2 dx \right]^{1/2} = Min.   (5.9)


This equivalence is independent of the choice of basis functions. In the case of 
trigonometric basis functions this requirement leads to the well-known formula- 
tions of Fourier series and the Fourier transform that will be discussed in the next 
section. 
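
The orthogonality relations of Eq. 5.7 are easy to check numerically. The following is a minimal sketch and not part of the original code base: the helper names (inner_product, cos_n, sin_n) and the choice of a midpoint rule with 2000 samples are assumptions made purely for illustration.

```python
import math


def inner_product(u, v, n_samples=2000):
    """Approximate the integral of u(x) * v(x) over [-pi, pi] (midpoint rule)."""
    dx = 2.0 * math.pi / n_samples
    total = 0.0
    for i in range(n_samples):
        x = -math.pi + (i + 0.5) * dx
        total += u(x) * v(x)
    return total * dx


def cos_n(n):
    """Return the basis function cos(n x)."""
    return lambda x: math.cos(n * x)


def sin_n(n):
    """Return the basis function sin(n x)."""
    return lambda x: math.sin(n * x)


# One representative integral for each case listed in Eq. 5.7
cc_diff = inner_product(cos_n(2), cos_n(5))  # j != k
cc_zero = inner_product(cos_n(0), cos_n(0))  # j = k = 0
cc_same = inner_product(cos_n(3), cos_n(3))  # j = k > 0
ss_same = inner_product(sin_n(4), sin_n(4))  # j = k > 0
cs_mix = inner_product(cos_n(3), sin_n(3))   # mixed cosine/sine product

print(cc_diff, cc_zero, cc_same, ss_same, cs_mix)
```

Up to rounding errors the printed values should reproduce the right-hand sides of Eq. 5.7 (zero, 2π, π, π, and zero, respectively).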


5.3.2 Fourier series and transforms 


The concepts of Fourier series and transforms are so central to seismology (data 
processing, spectral analysis, instrument correction, filtering, etc.) that, even 
though here they are used just to calculate space derivatives efficiently, we will 
present the fundamental equations. The general concept introduced above,
approximating a function in a certain interval by summing over weighted basis
functions, already implies a discretization, as we assume that the sum is finite.
The most important concept of this section will consist of the properties of 
Fourier series on regular grids. It is important to note that we really only scratch 
the surface of Fourier analysis here, and the interested reader is referred to the 
literature suggested at the end of this chapter. 

Let us start by presenting the result of the minimization problem presented in 
Eq. 5.9. The requirement that we approximate a 2π-periodic arbitrary function in
the interval [-π, π] (this can be relaxed) by a sum over sine and cosine functions
of the form


g_N(x) = \frac{1}{2} a_0 + \sum_{k=1}^{N} a_k \cos(kx) + b_k \sin(kx)   (5.10)

leads to the coefficients

a_k = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos(kx) dx,   k = 0, 1, \ldots, N
b_k = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \sin(kx) dx,   k = 0, 1, \ldots, N.   (5.11)


Note that k is the wavenumber 2π/λ. In light of this, the coefficients can be
interpreted as the amplitude of the harmonic waves that make up the function f(x)
and thus describe its spectral content.

Fig. 5.5 Illustration of orthogonal functions using cosine functions.
f(x) = cos(nx) is shown for n = 1, ..., 30 for increasing wavenumber n.

⁴ This misfit criterion is also commonly used for solving inverse problems in
seismology. In this case the problem is formulated as minimizing the difference
between theoretical calculations (e.g. travel times, polarities, waveforms) and
observations. The coefficients that are sought are physical parameters of the
Earth’s interior, or seismic sources.

Fig. 5.6 Illustration of Fourier series. Top: The function f(x) = x^2 defined
in the interval [0, 2π] (dotted line) is approximated by Fourier series of
increasing order N = 2-8 (solid lines). The approximation g_N(x) oscillates
around the exact function. Bottom: Detail of the approximation behaviour towards
the upper boundary for N = 8-24. Note the unavoidable overshoot of the
approximation at the boundary caused by the periodicity requirement (the Gibbs
phenomenon).


This formulation of Fourier series can be written in an elegant way using 
Euler’s formula to obtain


g_N(x) = \sum_{k=-N}^{k=N} c_k e^{ikx},   (5.12)

with complex coefficients c_k given by

c_k = \frac{1}{2} (a_k - i b_k)
c_{-k} = \frac{1}{2} (a_k + i b_k),   k > 0,   (5.13)
c_0 = \frac{1}{2} a_0,


while a_k and b_k are obtained through Eq. 5.11.
Let us see how the approximation in the continuous case works by finding the
interpolating trigonometric polynomial for the 2π-periodic function


f(x + 2n\pi) = f(x) = x^2,   x \in [0, 2\pi],   n \in \mathbb{N}.   (5.14)

Applying Eq. 5.11 we obtain (see exercises) the approximation g_N(x)

g_N(x) = \frac{4\pi^2}{3} + \sum_{k=1}^{N} \left[ \frac{4}{k^2} \cos(kx) - \frac{4\pi}{k} \sin(kx) \right]   (5.15)


where N is the number of terms used for the approximation. The approximation 
behaviour is shown in Fig. 5.6. While in general the approximation wraps around 
the original function and seems to converge, the overshoot at the boundary raises 
some questions about the actual convergence behaviour of Fourier series for in- 
creasing N. In fact it is important to note that while Fourier postulated that 
arbitrary (i.e. even non-differentiable) functions could be represented by a single 
analytical expression (the Fourier series) in the early nineteenth century, the idea
found substantial resistance amongst mathematicians. Today Fourier series and 
all related concepts such as the Fourier transform are ubiquitous in all fields of 
science. However the question of convergence remained open until the twentieth 
century! 
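
The convergence behaviour described above can be reproduced with a few lines of Python. This is only a sketch that evaluates Eq. 5.15 directly; the function name g_N, the sampling of 2001 points, and the orders N chosen for the check are illustrative assumptions.

```python
import math


def g_N(x, N):
    """Truncated Fourier series of f(x) = x**2 on [0, 2*pi] (Eq. 5.15)."""
    total = 4.0 * math.pi**2 / 3.0
    for k in range(1, N + 1):
        total += 4.0 / k**2 * math.cos(k * x) - 4.0 * math.pi / k * math.sin(k * x)
    return total


# Away from the boundary the series converges to x**2 as N grows
x_mid = math.pi
for N in (2, 8, 32, 128):
    print(N, abs(g_N(x_mid, N) - x_mid**2))

# At the boundary the periodic function jumps from (2*pi)**2 to 0;
# the series tends to the mean value of the two sides
boundary_value = g_N(0.0, 128)

# Close to the boundary the series exceeds the largest function value
# (2*pi)**2: this is the overshoot of the Gibbs phenomenon
x_fine = [2.0 * math.pi * i / 2000 for i in range(2001)]
overshoot = max(g_N(x, 64) for x in x_fine) - (2.0 * math.pi)**2
print(boundary_value, overshoot)
```

Increasing N moves the overshoot closer to the boundary, but its amplitude does not vanish.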

To make use of these concepts for the numerical solution of partial differential
equations let us move to the discrete world. We assume that we know our function
f(x) at a discrete set of points x_j given by


x_j = \frac{2\pi}{N} j,   j = 1, \ldots, N.   (5.16)


Using the ‘trapezoidal rule’ for the integration of a definite integral we obtain for
the Fourier coefficients


a_k^* = \frac{2}{N} \sum_{j=1}^{N} f(x_j) \cos(k x_j)   (5.17)

b_k^* = \frac{2}{N} \sum_{j=1}^{N} f(x_j) \sin(k x_j)   (5.18)


where the upper asterisk denotes the discrete case. Note that the integrals have
been replaced by sums over (weighted) values of function f at grid points x_j. We
thus obtain the specific Fourier polynomial with N = 2n


g_n^*(x) = \frac{1}{2} a_0^* + \sum_{k=1}^{n-1} \{ a_k^* \cos(kx) + b_k^* \sin(kx) \} + \frac{1}{2} a_n^* \cos(nx),   (5.19)

with the tremendously important property that

g_n^*(x_i) = f(x_i).   (5.20)


This behaviour is illustrated in Fig. 5.7. At the discrete points x_i (in fact the
integration points for the calculation of the Fourier coefficients), the approximating
function exactly (that means to machine precision including possible rounding er- 
rors) interpolates the original function f(x). Even though you might argue that the 
function is not well represented in between the grid points (see Fig. 5.7), in the 
context of our desire to solve a partial differential equation on a discrete grid we 
do not really care. We are only interested in the function itself and its derivatives 
at the grid point locations. In formulations that require the evaluation of integrals 
(e.g. finite-element-type methods) this is a different story! 
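
The exact interpolation property can be verified with a short script. The sketch below assumes the discrete coefficients are formed as weighted sums over the grid values (factor 2/N) and that the Fourier polynomial includes the half-weighted cosine term at the highest wavenumber n = N/2; the variable names are chosen for illustration only.

```python
import math

N = 16          # number of grid points, N = 2n
n = N // 2
x_grid = [2.0 * math.pi * j / N for j in range(1, N + 1)]
f_grid = [x**2 for x in x_grid]

# Discrete Fourier coefficients: sums over grid values weighted by 2/N
a = [2.0 / N * sum(f * math.cos(k * x) for f, x in zip(f_grid, x_grid))
     for k in range(n + 1)]
b = [2.0 / N * sum(f * math.sin(k * x) for f, x in zip(f_grid, x_grid))
     for k in range(n + 1)]


def g(x):
    """Discrete Fourier polynomial evaluated at an arbitrary point x."""
    total = 0.5 * a[0] + 0.5 * a[n] * math.cos(n * x)
    for k in range(1, n):
        total += a[k] * math.cos(k * x) + b[k] * math.sin(k * x)
    return total


# Largest misfit at the collocation points (expected at rounding-error level)
max_err = max(abs(g(x) - f) for x, f in zip(x_grid, f_grid))

# Misfit halfway between two grid points (no exactness expected here)
x_half = 0.5 * (x_grid[7] + x_grid[8])
print(max_err, abs(g(x_half) - x_half**2))
```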

At this point it is instructive to introduce the concept of cardinal functions, 
which will play an important role in some of the other methods that we will en- 
counter. The fact that we exactly recover the original discrete function at the grid 
points is thanks to the specific form of the cardinal function when interpolating 
with trigonometric basis functions. It is the result of approximating a spike func- 
tion (equivalent to a delta function in the continuous world) with Fourier series. 
The concept is illustrated in Fig. 5.8. Discrete interpolation and derivative oper- 
ations can also be formulated in terms of convolutions (this should become clear 
after we introduce the Fourier transform and its properties). Cardinal functions 
are used in the convolutional formulation for the interpolation problem. 
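
As a small illustration of the cardinal function, one can insert a unit spike at a single grid point into the discrete Fourier polynomial. For a spike the coefficients reduce to cosines and sines of the spike position, so the polynomial collapses to a sum of cosines of the distance to the spike. The sketch below assumes N = 16 points and a spike at i = N/2 as in Fig. 5.8; the names are illustrative.

```python
import math

N = 16
n = N // 2
i_spike = N // 2
x_grid = [2.0 * math.pi * j / N for j in range(1, N + 1)]
x_spike = x_grid[i_spike - 1]   # grid point x_i with i = N/2


def cardinal(x):
    """Fourier interpolant of a unit spike located at x_spike."""
    theta = x - x_spike
    total = 1.0 + math.cos(n * theta)
    for k in range(1, n):
        total += 2.0 * math.cos(k * theta)
    return total / N


# Unity at the spike location, zero at all other grid points
values = [cardinal(x) for x in x_grid]
print([round(v, 12) for v in values])
```

Any discrete function on this grid can then be written as a sum of shifted cardinal functions weighted by the grid values, which is the interpolation-by-convolution view mentioned above.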

What is missing in our discussion on orthogonal functions and function ap- 
proximations is the calculation of derivatives needed to solve the elastic wave 
equation. To introduce the powerful concept of spectral derivatives, we go back to 
the continuous world for a moment and introduce the continuous Fourier trans- 
form, which generalizes the concepts of Fourier series. One possible definition for 
the Fourier transform of function f(x) is 


F(k) = \mathcal{F}[f(x)] = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(x) e^{-ikx} dx,   (5.21)


where F(k) is the spectrum of the original space-dependent function f(x) in the
wavenumber domain k.⁵ In terms of spectral content the absolute values of the
complex Fourier transform |F(k)| correspond to the spectral amplitudes. This
is called the forward transform in the sense that we transform from the common
physical domain (space x or time t) to the spectral domain (spatial wavenumber
k or temporal frequency f). From now on we will denote the forward transform
of f(x) (or its discrete representation) by \mathcal{F}[f(x)]. Note that we attempt to use
a consistent notation for values defined in continuous or discrete physical space
with small letters and values defined in the spectral domain with capital letters,
respectively. To get back from the spectral domain to the physical space we apply
the inverse transform, which we denote as \mathcal{F}^{-1}[F(k)]


f(x) = \mathcal{F}^{-1}[F(k)] = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} F(k) e^{ikx} dk,   (5.22)


with F(k) being the complex spectrum. Note the infinite integral boundaries that,
because of the convergence behaviour, imply the equivalence of the representation
of function f(x) in the physical space and the spectral domain.

Fig. 5.7 Discrete Fourier series. Top: Approximation of the function f(x) = x^2
known at a discrete set of N = 16 points indicated by ‘+’. Bottom: Detail with
an illustration of the exact interpolation property at the so-called collocation
points.

⁵ The various definitions differ in particular concerning the sign of the exponent
and the factors in front of the integral. When working with intrinsic numerical
implementations of the Fourier transform it is always advisable to carefully check
which formulation is being used!

Fig. 5.8 Illustration of cardinal functions (N = 16). The function shown in this
graph is the interpolating (cardinal) function for grid point f_i = 1, i = N/2.
Note that it is unity at grid point x_i and zero at all other points on the
discrete grid denoted by ‘+’. The cardinal function has the form of a sinc
function.

Taking the formulation of the inverse transform (Eq. 5.22) it is straightforward
(see exercise) to obtain the derivative of function f(x) with respect to the spatial
coordinate


\frac{d}{dx} f(x) = \frac{d}{dx} \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} F(k) e^{ikx} dk
                  = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} ik F(k) e^{ikx} dk   (5.23)
                  = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} D(k) F(k) e^{ikx} dk,


with D(k) = ik corresponding to the derivative operator in the spectral domain.
Intuitively, in the continuous world (with infinite integration domain) the
derivative should be exact (in the sense of convergence behaviour). Equivalently
we can extend this formulation to the calculation of the nth derivative of f(x), to
obtain


F^{(n)}(k) = D(k)^n F(k) = (ik)^n F(k),   (5.24)


followed by an inverse Fourier transform to return to physical space. Thus using
the Fourier transform operator \mathcal{F} we can obtain an exact nth derivative using


f^{(n)}(x) = \mathcal{F}^{-1}[(ik)^n F(k)]
           = \mathcal{F}^{-1}[(ik)^n \mathcal{F}[f(x)]].   (5.25)

How does this work in the discrete world? With what accuracy can we obtain 
a derivative for a function defined on a discrete set of points as introduced in 
connection with Fourier series? To answer these questions let us write down the 
discrete Fourier transform that is so widely used for data analysis, filtering, etc. 
in all areas of science. Again there are several possibilities concerning the signs.
Adopting the complex notation of the forward transform we obtain


F_k = \sum_{j=0}^{N-1} f_j e^{-i 2\pi jk/N},   k = 0, \ldots, N-1   (5.26)


and the inverse transform

f_j = \frac{1}{N} \sum_{k=0}^{N-1} F_k e^{i 2\pi jk/N},   j = 0, \ldots, N-1   (5.27)


essentially the complex formulation of Eq. 5.19 with the same interpolation
properties. Here f_j is the vector describing the space-dependent function (e.g. the
seismic wave field) and F_k is its complex wavenumber spectrum.

In this formulation the number of calculations to be carried out for a Fourier
transform of a vector with length N is proportional to N^2. For long vectors or
multi-dimensional transforms this quickly becomes prohibitive. An ingenious
exploitation of symmetry properties introduced by
Cooley and Tukey (1965)⁶ reduces the proportionality to N log N. It is worth
exploring the consequence of this improvement for realistic 3D calculations (see 
exercises). 
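
To make the difference in cost tangible, the discrete transform can be written as a dense matrix-vector product and compared with the fast algorithm. The sketch below assumes the sign and scaling convention used by numpy.fft.fft (negative exponent, no scaling in the forward transform); the function name naive_dft and the random test vector are illustrative.

```python
import numpy as np


def naive_dft(f):
    """Discrete Fourier transform as an N x N matrix-vector product (O(N^2))."""
    N = f.size
    j = np.arange(N)
    k = j.reshape((N, 1))
    W = np.exp(-2j * np.pi * j * k / N)   # dense N x N transform matrix
    return W @ f


rng = np.random.default_rng(0)
f = rng.standard_normal(64)

F_naive = naive_dft(f)
F_fft = np.fft.fft(f)          # same transform with O(N log N) operations
max_diff = np.max(np.abs(F_naive - F_fft))

# Rough operation-count ratio N^2 / (N log2 N) for one long vector
N_long = 2**20
ratio = N_long / np.log2(N_long)
print(max_diff, ratio)
```

Both routes give the same spectrum to rounding error; only the number of operations differs.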

By analogy with the continuous formulation of the Fourier transformation, 
noting that we exactly interpolate at the collocation points, we are able to obtain 
exact (to machine precision) nth derivatives on our regular grid by performing the
following operations (here the \mathcal{F} operator stands for the discrete Fourier
transform often realized by the fast Fourier transform) on vector f_j defined at grid
points x_j:


\partial_x^{(n)} f_j = \mathcal{F}^{-1}[(ik)^n F_k],   (5.28)

where

F_k = \mathcal{F}[f_j],   (5.29)


and we used the partial derivative symbol since the discrete space-dependent 
function might also depend on time (as is the case for the displacement field 
in the wave equation). 

⁶ You don’t always have to find a new particle, or discover plate tectonics or
relativity, to leave your mark! This four-page FFT paper probably beats them all
in terms of citations!

Fig. 5.9 Illustration of Fourier derivatives. Top: Periodic Gauss function f(x)
defined in the interval [0, 2π]. Bottom: Superposition of the analytical
derivative (solid line) and the numerical derivative (dot-dashed line) based on
the fast Fourier transform. The differences are indistinguishable, therefore they
are illustrated by the dotted line, multiplied by a factor 10^{17}! This is the
expected level of inaccuracy due to rounding errors at double precision (8 bytes
per number).

Let us see how this works in practice and take an example. We initialize a
2π-periodic Gauss function in the interval x ∈ [0, 2π] as


f(x) = e^{-1/\sigma^2 (x - x_0)^2}   (5.30)

with x_0 = π and the derivative

f'(x) = -2 \frac{(x - x_0)}{\sigma^2} e^{-1/\sigma^2 (x - x_0)^2}   (5.31)


allowing us an easy evaluation of the numerical accuracy of the Fourier-based
derivative. The vector with values f_j is required to have an even number of
uniformly sampled elements. In our example this is realized with a grid spacing
of dx = 2π/N with N = 128 and x_j = j 2π/N, j = 1, ..., N, σ = 0.5, and x_0 = π.
The results are shown in Fig. 5.9. They were obtained using the intrinsic fast
Fourier transform (numpy library) with the following Python commands:


    # [...]
    from numpy import pi, arange, linspace, exp
    from numpy.fft import fft, ifft

    # Basic parameters
    nx = 128
    x0 = pi


    def fourier_derivative(f, dx):
        # Length of vector f
        nx = f.size
        # Initialize k vector up to Nyquist wavenumber
        kmax = pi / dx
        dk = kmax / (nx / 2)
        k = arange(float(nx))
        # Positive wavenumbers first, then negative ones (FFT ordering);
        # integer division is needed for indexing
        k[:nx // 2] = k[:nx // 2] * dk
        k[nx // 2:] = k[:nx // 2] - kmax
        # Fourier derivative
        ff = 1j * k * fft(f)
        df = ifft(ff).real
        return df

    # [...]
    # Main program
    # Initialize space and Gauss function (also return dx)
    x = linspace(2 * pi / nx, 2 * pi, nx)
    dx = x[1] - x[0]
    sigma = 0.5
    f = exp(-1 / sigma**2 * (x - x0)**2)

    # Calculate derivative of vector f
    df = fourier_derivative(f, dx)
    # [...]


Note that 1j represents the imaginary unit. The reader is referred to Python
tutorials or the supplementary electronic material. 

This calculation of the derivative based on the Fourier transform is very ele- 
gant. What needs some care is the correct initialization of the wavenumber axis 
k, that is, the imaginary part of the derivative operator. The specific form of this 
vector is defined by the requirements of the fast Fourier transform algorithm and the way
the frequencies are arranged. The reader is referred to the documentation of the 
specific Fourier transform routines. 
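
As an alternative to assembling the wavenumber vector by hand, NumPy provides numpy.fft.fftfreq, which returns the sample frequencies in the ordering expected by its FFT routines. The sketch below is an assumption-laden variant of the example above (the function name fourier_derivative_fftfreq is made up for illustration) and compares the result with the analytical derivative of the Gauss function.

```python
import numpy as np


def fourier_derivative_fftfreq(f, dx):
    """First derivative of a periodic, regularly sampled vector f."""
    # fftfreq returns cycles per unit length; multiply by 2*pi for wavenumbers
    k = 2.0 * np.pi * np.fft.fftfreq(f.size, d=dx)
    return np.fft.ifft(1j * k * np.fft.fft(f)).real


# Same set-up as in the example above
nx = 128
x = np.linspace(2 * np.pi / nx, 2 * np.pi, nx)
dx = x[1] - x[0]
sigma, x0 = 0.5, np.pi
f = np.exp(-1 / sigma**2 * (x - x0)**2)

df_num = fourier_derivative_fftfreq(f, dx)
df_ana = -2 * (x - x0) / sigma**2 * f      # analytical derivative
max_err = np.max(np.abs(df_num - df_ana))
print(max_err)
```

For this grid the wavenumbers returned by fftfreq coincide with the hand-built vector (0, 1, ..., 63, -64, ..., -1).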

From Fig. 5.9 we infer that the numerical approximation of the derivative is 
indeed obtained with errors only coming from round-off problems caused by the 
bit depth of the floating-point definition (here double precision). This is as good
as it gets, at least for the conditions under which these numerical calculations were
carried out (regular spacing, periodicity). We will proceed with assembling this 
powerful tool to obtain numerical solutions to the wave equation. 


5.4 The Fourier pseudospectral method 


With the simple recipe to calculate exact nth derivatives on a regular-spaced grid 
(assuming periodicity) we now have all it takes to solve a wave equation-type 
problem. For the moment we disregard the implementation of spatial boundary 
conditions. Even though our main focus in this volume is to illustrate the concepts 
in 1D, we proceed in a similar way as in the previous chapter presenting first the
acoustic 1D and 2D cases. The reason is that the numerical solutions can still be 
treated analytically and important results on numerical dispersion emerge. This 
will be followed by the elastic 1D case. 


5.4.1 Acoustic waves in 1D 


Assuming implicitly the space-time dependence, the constant-density acoustic 
wave equation in 1D is given as 


\partial_t^2 p = c^2 \partial_x^2 p + s   (5.32)


and we seek solutions for pressure field p(x, t) for a velocity model c(x) with
(in principle) arbitrary heterogeneous variations. The source is injected through
s(x, t). The time-dependent part is solved using a standard three-point
finite-difference operator, leading to


\frac{p_j^{n+1} - 2 p_j^n + p_j^{n-1}}{dt^2} = c_j^2 \partial_x^2 p_j^n + s_j^n   (5.33)


where upper indices represent time and lower indices space. The remaining task 
is to calculate the second derivatives on the right-hand side. 

Based on the developments in the previous sections we proceed by calculat- 
ing the second derivatives using the Fourier transform (in practice this is usually 
realized by applying the discrete fast Fourier transform): 

\partial_x^2 p_j^n = \mathcal{F}^{-1}[(ik)^2 P_\nu^n] = \mathcal{F}^{-1}[-k^2 P_\nu^n],   (5.34)

leading (within some limits) to an exact derivative with only numerical rounding
errors. Here, P_\nu^n is the discrete complex wavenumber spectrum at time n.
As a consequence the main overall error of the numerical solutions comes from
the time integration scheme. The following Python code snippet illustrates the
compact algorithm that results when this operation is carried out with the help
of a function (or subroutine) calculating the nth derivative based on Fourier
transforms.⁷

⁷ In the code below space-dependent fields (pressure p, function f, complex
spectrum ff, spatial source function sg, numerical derivative df, second
derivative d2p, pressure time levels pold and pnew) are vectors of length [nx]
where nx is the number of grid points. 1j is the imaginary unit.

Table 5.1 Simulation parameters for 1D acoustic simulation with the Fourier method

Parameter    Value
x_max        1,250 m
nx           2048
c            343 m/s
dt           0.00036 s
dx           0.62 m
f_0          60 Hz
ε            0.2


    # [...]
    # Fourier derivative
    def fourier_derivative_2nd(f, dx):
        # [...]
        # Fourier derivative
        ff = (1j * k)**2 * fft(f)
        df = ifft(ff).real
        return df
    # [...]
    # Time extrapolation
    for it in range(nt):
        # 2nd space derivative
        d2p = fourier_derivative_2nd(p, dx)
        # Extrapolation
        pnew = 2 * p - pold + c**2 * dt**2 * d2p
        # Add sources
        pnew = pnew + sg * src[it] * dt**2
        # Remap pressure field
        pold, p = p, pnew
    # [...]


The function fourier_derivative_2nd returns the second derivative of function f 
discretized with grid increment dx. The source is injected via a smooth space- 
dependent field sg of Gaussian shape to avoid the Gibbs phenomenon. Otherwise 
the code differs from the finite-difference solution only in the calculation of 
the space derivatives. Let us take an example and compare the result with 
the finite-difference method. The parameters for the simulation are given in 
Table 5.1. 

Before showing the results we need to discuss a specific feature of source in- 
jection when series-based methods are used. In the case of finite differences it 
was possible and straightforward to initiate a point-like source at one grid point. 
This is no longer the case for pseudospectral methods. The Fourier transform 
of a spike-like function creates oscillations that damage the accuracy of the so- 
lution. To avoid this it is common practice to define a space-dependent part of 
the source using a Gaussian function. In the example shown in the following a 
Gaussian function e^{-1/\sigma^2 (x - x_0)^2} was used for pseudospectral and finite-difference
algorithms with σ = 2dx, dx being the grid interval and x_0 the source location. To
match with the analytical solution the integral over this function should be scaled 
to 1/dx (see previous chapter). 
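
Putting the pieces together, a complete 1D run might look like the following sketch. It is not the original program: the source time function (first derivative of a Gaussian with delay t0 = 4/f0), the source position in the centre of the grid, the number of time steps, and the use of numpy.fft.fftfreq for the wavenumber axis are all assumptions made for illustration, and the source amplitude is left unscaled. The parameters loosely follow Table 5.1.

```python
import numpy as np

# Parameters loosely following Table 5.1 (dx and dt are derived here)
nx, xmax, c, f0, eps = 2048, 1250.0, 343.0, 60.0, 0.2
dx = xmax / (nx - 1)
dt = eps * dx / c
nt = 1000
x = np.arange(nx) * dx
xs = x[nx // 2]                 # assumed source position (centre of grid)

# Wavenumber axis in FFT ordering
k = 2.0 * np.pi * np.fft.fftfreq(nx, d=dx)


def fourier_derivative_2nd(f):
    """Second space derivative of a periodic vector via the FFT."""
    return np.fft.ifft(-k**2 * np.fft.fft(f)).real


# Assumed source time function: first derivative of a Gaussian
t = np.arange(nt) * dt
t0 = 4.0 / f0
src = -2.0 * (t - t0) * f0**2 * np.exp(-(f0**2) * (t - t0)**2)

# Smooth spatial source term (Gaussian with sigma = 2 dx, amplitude unscaled)
sigma = 2.0 * dx
sg = np.exp(-1.0 / sigma**2 * (x - xs)**2)

# Time extrapolation
p = np.zeros(nx)
pold = np.zeros(nx)
for it in range(nt):
    d2p = fourier_derivative_2nd(p)
    pnew = 2 * p - pold + c**2 * dt**2 * d2p
    pnew = pnew + sg * src[it] * dt**2
    pold, p = p, pnew

# Position of the right-going pulse compared with the expected travel distance
i_peak = nx // 2 + np.argmax(p[nx // 2:])
x_peak = x[i_peak]
x_expected = xs + c * (nt * dt - t0)
print(x_peak, x_expected)
```

With these settings the pulse stays well inside the domain, so the periodic boundaries implied by the Fourier method do not affect the result.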

The results of a simulation in a homogeneous medium are shown in Fig. 5.10 
for various propagation distances given in terms of dominant wavelengths nλ. It
is instructive to compare the pseudospectral results with those obtained with the 
finite-difference approximation of the space derivatives using exactly the same 
set-up. The results indicate that while the pseudospectral solutions show only very
small effects of numerical dispersion that increase with distance, the finite-difference
solutions are too inaccurate to be usable. However, note the considerable
improvement of the five-point over the three-point operator already discussed in the
previous chapter. Another observation is that while the dispersive waves in the
finite-difference method are slower (arrive later), the small dispersive signals visible in
the pseudospectral results arrive earlier (are faster).

Results like the ones shown in Fig. 5.10 seem to suggest that the pseudospec- 
tral method is far superior to the finite-difference method, as was often claimed 
in the first papers.⁸ In fact, so far we have only looked at part of the story and the
comparison is not entirely fair. First, we used the same number of grid points in 
all cases. That means that the number of floating point operations and thus run 
time is much higher in the case of the Fourier method. Second, we used the same 
stability criterion for all simulations, which creates a disadvantage for the finite- 
difference method. We will dwell more on comparing solutions from different 
methods when we introduce the elastic case. 

To understand what is happening, let us proceed in an analogous way
to the finite-difference method and seek analytical solutions for the numerical
algorithm.

Fig. 5.10 Application of the Fourier method to the acoustic wave equation:
comparison with finite differences. Simulation results after propagation
distances given in terms of wavelengths. Left column: finite differences with
three-point operator. Middle column: finite differences with five-point operator.
Right column: Fourier method. The correct waveform corresponds to the shape in
the top-right graph. All windows are scaled to the same space and amplitude
interval. Note the superior stability of the waveform for the Fourier method and
the strong numerical dispersion for both finite-difference implementations.

⁸ In fact, several methods were introduced with the notion that they were
superior to others. In some cases, when more issues were taken into account,
these claims had to be substantially revised. The lesson is that you have to be
very careful when comparative statements between methods are made (including in
this volume).

Fig. 5.11 Numerical dispersion of the Fourier method. The numerical phase
velocity is shown as a function of number of grid points per wavelength for
various CFL criteria (ε). The correct propagation velocity is 2,000 m/s. As
the number of grid points per wavelength increases the correct velocity is
recovered. Note that the error due to temporal discretization leads to higher
phase velocities (as opposed to a decrease, in the case of finite differences).


5.4.2 Stability, convergence, dispersion 


An effective way of understanding the behaviour of numerical approximations is 
to seek solutions of the algorithms using discrete plane waves of the form 


p_j^n = e^{i(kjdx - \omega n dt)}
\partial_x^2 p_j^n = -k^2 e^{i(kjdx - \omega n dt)}   (5.35)

where the second space derivatives are given in the way they are calculated
using Fourier transforms. Following the developments in the chapter on finite
differences, the time-dependent part can be expressed as


\partial_t^2 p_j^n = -\frac{4}{dt^2} \sin^2\left(\frac{\omega dt}{2}\right) e^{i(kjdx - \omega n dt)},   (5.36)

where we made use of the Euler formula and the fact that

\sin^2 x = \frac{1}{2} (1 - \cos 2x).   (5.37)


This is the so-called von Neumann analysis and we proceed in the same way as
before and replace the partial derivatives in the wave equation (Eq. 5.32) with
these results, and extract the frequency to obtain a formula for the phase velocity
c(k) after division by wavenumber k:


c(k) = \frac{\omega}{k} = \frac{2}{k dt} \sin^{-1}\left(\frac{k c dt}{2}\right).   (5.38)


This result has some important consequences. First, when dt becomes small,
\sin^{-1}(kcdt/2) \approx kcdt/2 and we recover c = ω/k as the analytical phase
velocity. This demonstrates the convergence of the scheme and an important aspect
is that the space increment dx does not appear in this equation (as was the case
in the finite-difference method), with the result that making the time step smaller
always decreases the error of the overall solution. Second, as the argument of the
inverse sine must not exceed one, the stability limit requires k_max (c dt/2) ≤ 1.
As k_max = π/dx (sampling theorem) the stability criterion for the 1D case is
ε = c dt/dx ≤ 2/π ≈ 0.64.

Eq. 5.38 allows us to calculate the numerical phase velocity as a function of 
the number of grid points used for various stability criteria. This dispersion curve 
illustrated in Fig. 5.11 shows the tremendous accuracy of the phase velocity in 
the case of ε = 0.2 for more than ten grid points per wavelength. The results
also indicate that the error due to time discretization leads to an increase in the 
phase velocity with decreasing number of grid points per wavelength. This is the 
behaviour we saw in the numerical simulation presented in Fig. 5.10. 
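
The dispersion curve can be evaluated with a few lines of Python. In the sketch below the wavelength is expressed as a number of grid points per wavelength, so that the argument of the inverse sine, k c dt/2, becomes π ε divided by that number; the function name phase_velocity and the sampled values are illustrative assumptions, and the velocity of 2,000 m/s is taken from the caption of Fig. 5.11.

```python
import math


def phase_velocity(n_per_wavelength, eps, c=2000.0):
    """Numerical phase velocity of the Fourier method (Eq. 5.38).

    With wavelength = n_per_wavelength * dx and eps = c * dt / dx the
    argument k * c * dt / 2 equals pi * eps / n_per_wavelength.
    """
    arg = math.pi * eps / n_per_wavelength
    return c * math.asin(arg) / arg


# Phase velocity for several CFL values and grid points per wavelength
for eps in (0.2, 0.4, 0.6):
    print(eps, [round(phase_velocity(n, eps), 1) for n in (2, 5, 10, 20)])

# Largest admissible CFL value: the argument must not exceed one at the
# Nyquist wavenumber (two grid points per wavelength)
eps_max = 2.0 / math.pi
print(eps_max)
```

All values lie above the true velocity, in line with the observation that the time discretization makes the waves slightly too fast.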

One of the features we discovered for finite-difference approximations to the 
wave equation in higher dimensions was the fact that the error of the wavefield de- 
pends on the propagation direction (numerical anisotropy). How does the Fourier 
method behave in this context? To answer this question we now turn to the 2D 
acoustic case. 
5.4.3 Acoustic waves in 2D


The acoustic wave equation in 2D reads 
$$\partial_t^2 p = c^2\left(\partial_x^2 p + \partial_z^2 p\right) + s, \tag{5.39}$$


where the difference with the 1D case is simply the fact that all space-dependent 
fields are now defined in (x, z), stored in the computer as matrices, and structure 
and source are translationally invariant in the third dimension (here: y). As in the
previous section we replace the time-dependent part by a standard three-point 
finite-difference approximation and extrapolate the pressure field accordingly: 


$$\frac{p_{j,k}^{n+1} - 2p_{j,k}^{n} + p_{j,k}^{n-1}}{dt^2} = c_{j,k}^2\left(\partial_x^2 p + \partial_z^2 p\right)_{j,k} + s_{j,k}^n. \tag{5.40}$$


We are left with approximating the second partial derivatives with respect to x 
and z (denoted by indices j and k, respectively) using the Fourier approach. The 
sequence of operations to achieve this is 


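$$\partial_x^2 p + \partial_z^2 p = \mathcal{F}^{-1}\left[-k_x^2\,\mathcal{F}[p]\right] + \mathcal{F}^{-1}\left[-k_z^2\,\mathcal{F}[p]\right], \tag{5.41}$$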


where the discrete Fourier transforms are usually calculated with the fast Fourier 
transform. In practice, as is the case in the finite-difference method, we have to 
loop through all grid points in x and z and independently calculate the space
derivatives, keeping one of the coordinates constant, while calculating the deriva- 
tive with respect to the other. This is illustrated in the following code snippet: 


    # [...]
    # second space derivatives
    for j in range(nz):
        d2px[:, j] = fourier_derivative_2nd(p[:, j].T, dx)
    for i in range(nx):
        d2pz[i, :] = fourier_derivative_2nd(p[i, :], dx)

    # Extrapolation
    pnew = 2 * p - pold + c**2 * dt**2 * (d2px + d2pz)

where we assume identical space increments dx in both directions. In order to
investigate the wave field behaviour in various directions we perform a simulation
with parameters as listed in Table 5.2 and compare with the finite-difference
solution using a five-point operator for the space derivatives. The snapshots
shown in Fig. 5.12 indicate that (1) there is strong anisotropic dispersion
behaviour visible for the finite-difference solution, with the most accurate
direction at 45° to the coordinate axes, and (2) the Fourier solution shows weak
signs of dispersion, but the most important observation is that there does not
seem to be a directional dependence: the error is isotropic.

Table 5.2 Simulation parameters for 2D acoustic simulation with the Fourier method

    Parameter   Value
    x_max       200 m
    nx          256
    c           343 m/s
    dt          0.00046 s
    dx          0.78 m
    f_0         200 Hz
    ε           0.2

Fig. 5.12 Comparison of Fourier and finite-difference method in 2D. Left:
Snapshot of wavefield simulated with the Fourier method. The wavefront is
spherical and does not show dispersive behaviour. Right: Snapshot obtained with
the finite-difference method with exactly the same set-up (which is not fair to
finite differences). Strong anisotropic dispersion behaviour is visible. Note,
however, that the Fourier method requires substantially more calculations.

⁹ There exists an isotropic grid in 2D that is hexagonal; in 3D there is no such
thing as an isotropic grid.

Looking at the simulation parameters, the wave field is actually sampled only 
with 2-3 grid points per wavelength. This is not entirely true as there is a 
Gaussian-shaped spatial source function that acts as a low-pass filter. Neverthe- 
less, it is clear that with this set-up we are way beyond the healthy operating 
range for a low-order finite-difference approximation. Again, please note that the 
point here is not to sell the Fourier method as the better choice! The point is 
to illustrate an interesting property, namely an isotropic dispersion error for the 
Fourier method despite the fact that the cubic grid is not isotropic.⁹

This isotropic error behaviour is certainly an advantage when simulating phys- 
ical anisotropy that is often weak. Then, the physical effects cannot be confused 
with numerical anisotropy caused by discretization. To bring this point home, let 
us look at the analytical result for the dispersion problem in 2D. 
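
The 2D extrapolation step used in this section can be sketched in a self-contained form. This is only an illustration under stated assumptions: the helper `fourier_derivative_2nd` is referred to in the text but its body is not shown there, so the version below is a hypothetical implementation based on NumPy's FFT with a periodic grid; `fourier_step_2d` is likewise an illustrative name, and the source term is omitted.

```python
import numpy as np


def fourier_derivative_2nd(f, dx):
    """Second derivative of a periodic 1D array via the FFT (assumed helper)."""
    # Angular wavenumbers matching NumPy's FFT ordering
    k = 2.0 * np.pi * np.fft.fftfreq(f.size, d=dx)
    # Second derivative corresponds to multiplication by (ik)^2 = -k^2
    return np.fft.ifft(-(k**2) * np.fft.fft(f)).real


def fourier_step_2d(p, pold, c, dt, dx):
    """One time step of the 2D acoustic scheme (Eqs. 5.40 and 5.41, no source)."""
    nx, nz = p.shape
    d2px = np.zeros_like(p)
    d2pz = np.zeros_like(p)
    # Derivative with respect to x, keeping z fixed
    for j in range(nz):
        d2px[:, j] = fourier_derivative_2nd(p[:, j], dx)
    # Derivative with respect to z, keeping x fixed
    for i in range(nx):
        d2pz[i, :] = fourier_derivative_2nd(p[i, :], dx)
    # Three-point extrapolation in time
    return 2 * p - pold + c**2 * dt**2 * (d2px + d2pz)


if __name__ == "__main__":
    n = 32
    dx = 2.0 * np.pi / n
    x = np.arange(n) * dx
    X, Z = np.meshgrid(x, x, indexing="ij")
    p = np.sin(2 * X) * np.cos(3 * Z)
    pnew = fourier_step_2d(p, p, c=1.0, dt=0.01, dx=dx)
    # For this field the Laplacian is -(2**2 + 3**2) * p
    print(np.max(np.abs(pnew - p * (1 - 13 * 0.01**2))))
```

The two loops mirror the snippet above; in practice the same result could be obtained with a single FFT call along each axis, which is a design choice rather than something prescribed by the text.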


5.4.4 Numerical anisotropy 


To investigate the dispersion behaviour of our numerical scheme based on the
Fourier method we proceed by finding solutions to monochromatic plane waves
propagating in the direction $\mathbf{k} = (k_x, k_z)$:

$$p_{j,k}^n = e^{i(k_x j\,dx + k_z k\,dx - \omega n\,dt)}, \tag{5.42}$$

assuming regular discretization in space and time with increments dx and dt,
respectively. In the case of the Fourier method the derivatives can be calculated
exactly, leading to the following formulation:

$$\begin{aligned}
\partial_x^2 p_{j,k}^n &= -k_x^2\, e^{i(k_x j\,dx + k_z k\,dx - \omega n\,dt)}, \\
\partial_z^2 p_{j,k}^n &= -k_z^2\, e^{i(k_x j\,dx + k_z k\,dx - \omega n\,dt)}.
\end{aligned} \tag{5.43}$$

Combining this with the analytical form of the three-point operator for the time
derivative (see previous sections)

$$\partial_t^2 p_{j,k}^n = -\frac{4}{dt^2}\,\sin^2\left(\frac{\omega\,dt}{2}\right) e^{i(k_x j\,dx + k_z k\,dx - \omega n\,dt)} \tag{5.44}$$

we obtain the numerical dispersion relation in 2D for arbitrary wave number
vectors (i.e. propagation directions) $\mathbf{k}$ as

$$c(\mathbf{k}) = \frac{\omega}{|\mathbf{k}|} = \frac{2}{|\mathbf{k}|\,dt}\,\sin^{-1}\left(\frac{c\,dt\,\sqrt{k_x^2 + k_z^2}}{2}\right). \tag{5.45}$$


The direction-dependent error of the phase velocity is illustrated in Fig. 5.13, 
confirming the observation of our 2D simulation example. The error is isotropic 
and in the case of 10 points per wavelength below 0.5%. 
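
The isotropy of Eq. 5.45 can be checked numerically. The sketch below is an illustration only: the function name, the unit values of c and dx, and the choice ε = 0.2 (taken from Table 5.2) are assumptions made for the example.

```python
import numpy as np


def phase_velocity_error_percent(theta, points_per_wavelength, eps, c=1.0, dx=1.0):
    """Relative phase velocity error (in %) from Eq. 5.45 for direction theta."""
    dt = eps * dx / c                                  # time step from eps = c*dt/dx
    k_abs = 2.0 * np.pi / (points_per_wavelength * dx)  # wavenumber magnitude
    kx = k_abs * np.cos(theta)
    kz = k_abs * np.sin(theta)
    k_norm = np.sqrt(kx**2 + kz**2)
    c_num = 2.0 / (k_norm * dt) * np.arcsin(c * dt * k_norm / 2.0)
    return 100.0 * (c_num - c) / c


if __name__ == "__main__":
    theta = np.linspace(0.0, 2.0 * np.pi, 73)
    for n_lambda in (3, 5, 10):
        err = phase_velocity_error_percent(theta, n_lambda, eps=0.2)
        print(n_lambda, err.min(), err.max())
```

Because only the magnitude of the wavenumber vector enters the inverse sine, the minimum and maximum over all directions coincide up to rounding.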


5.4.5 Elastic waves in 1D 


Finally we want to solve the 1D elastic case 


$$\rho(x)\,\partial_t^2 u(x,t) = \partial_x\left[\mu(x)\,\partial_x u(x,t)\right] + f(x,t), \tag{5.46}$$

which contains a sequence of first derivatives with respect to space of the
displacement field u and the space-dependent shear modulus μ, and combinations
thereof. Again, the finite-difference approximation of the extrapolation part
leads to

$$\rho_i\,\frac{u_i^{j+1} - 2u_i^{j} + u_i^{j-1}}{dt^2} = \left(\partial_x\left[\mu(x)\,\partial_x u(x,t)\right]\right)_i^j + f_i^j, \tag{5.47}$$

with space derivatives to be calculated using the Fourier method. The sequence
of operations required to obtain the right-hand side of Eq. 5.46 (without sources)
reads

$$\begin{aligned}
u_i^j &\to \mathcal{F}[u_i^j] \to U_\nu^j \to -ik\,U_\nu^j \to \mathcal{F}^{-1}[-ik\,U_\nu^j] \to \partial_x u_i^j, \\
\mu_i\,\partial_x u_i^j &\to \mathcal{F}[\mu_i\,\partial_x u_i^j] \to \tilde{U}_\nu^j \to \mathcal{F}^{-1}[-ik\,\tilde{U}_\nu^j] \to \partial_x\left[\mu(x)\,\partial_x u(x,t)\right]_i^j,
\end{aligned} \tag{5.48}$$

where, as a reminder, capital letters denote fields in the spectral domain, lower
indices with Greek letters indicate discrete frequencies, and $\tilde{U}_\nu^j$ was
introduced as an intermediate result to facilitate notation. It is important to note
that, as demonstrated in the preceding sections, this entire result is accurate to 
machine precision (provided the space-dependent fields obey the requirements 
for the discrete Fourier transform to be accurate). Note also that, in comparison 
with the acoustic case, two Fourier transform operations are necessary, basically 
doubling the computational effort to evaluate the space-dependent part of the 
wave equation. 
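
The two-pass sequence of Eq. 5.48 can be sketched as follows. This is a minimal illustration assuming a periodic grid and NumPy's FFT conventions; with `numpy.fft` the inverse transform carries the factor e^{+ikx}, so the first derivative corresponds to multiplication by +ik in this sketch, and the sign has to be adapted to whichever transform convention is in use. The function names are hypothetical.

```python
import numpy as np


def fourier_derivative(f, dx):
    """First derivative of a periodic 1D array via the FFT (NumPy sign convention)."""
    k = 2.0 * np.pi * np.fft.fftfreq(f.size, d=dx)
    return np.fft.ifft(1j * k * np.fft.fft(f)).real


def elastic_rhs(u, mu, dx):
    """Evaluate d/dx [ mu(x) du/dx ] with two forward/inverse transform pairs."""
    du = fourier_derivative(u, dx)        # first pass: du/dx
    stress = mu * du                      # multiply by shear modulus in space
    return fourier_derivative(stress, dx)  # second pass: d/dx of mu * du/dx


if __name__ == "__main__":
    n = 64
    dx = 2.0 * np.pi / n
    x = np.arange(n) * dx
    u = np.sin(x)
    mu = 2.0 + np.cos(x)
    rhs = elastic_rhs(u, mu, dx)
    # Analytical result: d/dx[(2 + cos x) cos x] = -2 sin x - sin 2x
    print(np.max(np.abs(rhs - (-2.0 * np.sin(x) - np.sin(2.0 * x)))))
```

The multiplication by μ is done in the space domain between the two transform pairs, which is why the elastic case needs roughly twice the transform effort of the acoustic case.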

Even though you might get bored with looking at solutions to the homogeneous
problem, it is actually instructive to investigate the performance difference
to the finite-difference method. Throughout this volume I will stress again and
again that a fair comparison between methods is extremely difficult. However,
the comparison can serve to illustrate some specific aspects of one method vs. the
other, which are not usually generalizable to arbitrary applications.

Fig. 5.13 Numerical anisotropy of the Fourier method. The error of the phase
velocity is shown as a function of propagation direction (in %) for varying
numbers of grid points per wavelength. Note that there is no directional
dependence of the phase velocity error. Thus, unlike with the finite-difference
method, there is no numerical anisotropy.

Table 5.3 Simulation parameters for 1D elastic simulation with pseudospectral
Fourier method (PS), comparison with finite differences (FD)

            FD           PS
    nx      3,000        1,000
    nt      2,699        3,211
    c       3,000 m/s    3,000 m/s
    dx      0.33 m       1.0 m
    dt      5.5e-5 s     4.7e-5 s
    f_0     260 Hz       260 Hz
    ε       0.5          0.14
    n/λ     34           11

Fig. 5.14 Elastic Fourier method in 1D: comparison with finite differences.
An attempt is made to compare memory requirements and computation speed
between the Fourier method (bottom) and a fourth-order finite-difference scheme
(top), solving the same problem. In both cases the relative error compared to the
analytical solution (misfit energy) is approximately 1%. The run time is
comparable. The big difference is the number of grid points along the x dimension.
The ratio is 3:1 (FD:Fourier). Panel titles: FD (5-pt), run time: 2.50875 s;
Fourier, run time: 3.518 s.

We proceed by finding a set-up for a classic staggered-grid finite-difference 
solution to the elastic 1D problem (see previous chapter) that leads to an energy 
misfit to the analytical solution $u_a$ of 1% (which is commonly required for realistic
simulations). The energy misfit is simply calculated by $(u_{FD} - u_a)^2 / u_a^2$. Then we
adjust the set-up for the Fourier solution such that we obtain roughly the same 
error (1%) and log the run time. The parameters for these set-ups are listed in 
Table 5.3 and the seismograms are shown in Fig. 5.14. 
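
The energy misfit used for this comparison can be computed in a few lines. The sketch below assumes that the squared differences and the squared reference amplitudes are summed over all time samples of the seismogram; the text gives only the pointwise expression, so this summation and the function name `energy_misfit` are assumptions.

```python
import numpy as np


def energy_misfit(u_num, u_ana):
    """Relative energy misfit between numerical and analytical seismograms."""
    u_num = np.asarray(u_num, dtype=float)
    u_ana = np.asarray(u_ana, dtype=float)
    # Sum of squared differences, normalized by the energy of the reference
    return np.sum((u_num - u_ana) ** 2) / np.sum(u_ana**2)


if __name__ == "__main__":
    t = np.linspace(0.0, 1.0, 500)
    u_ana = np.sin(2.0 * np.pi * 5.0 * t)
    u_num = 1.1 * u_ana  # a 10% amplitude error as a toy example
    print(energy_misfit(u_num, u_ana))
```

With this definition a uniform 10% amplitude error corresponds to an energy misfit of 1%, which gives a feeling for how strict the criterion is.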

It is worth having a close look at the simulation set-up. The number of grid 
points (and thus the number per wavelength) is three times larger in the case of 
the finite-difference method. For the Fourier simulation the stability criterion ε
was adjusted (leading to a change in the time step) until the error was in the same 
range. 

It is interesting to note (but not necessarily representative) that we end up 
with a quite similar time increment (and thus overall number of simulation time 
steps). The results in Fig. 5.14 indicate that the overall run time (using a simple 
elapsed time routine) is also quite similar. Thus the main difference comes with 
the memory used to obtain this result. Obviously, the memory reduction by using 
the Fourier method is a factor of 3, while more floating point operations per iter- 
ation lead in the end to the same accuracy. For simulations of higher dimensions 
this effect is much more dramatic. 

The actual numbers in this example should not be taken too seriously, but 
the effect on the memory economization was one of the key reasons for the great 
interest in this method at a time when (1) there were no parallel architectures 
other than vector processors, and (2) rapid-access memory was expensive (and 
small). As indicated in the introduction, the global communication property of 
pseudospectral methods and the associated difficulties with efficient paralleliza- 
tion led to a waning interest in those techniques soon after the emergence of 
parallel hardware. 


5.5 Infinite order finite differences 
You might be surprised to find a section on finite differences in this chapter. 
However, after having learned the connection between derivative calculations and 
the Fourier transform, we can take an alternative look at difference operators from 
a spectral point of view. We will find that any finite-difference operation can in fact 
be described as a convolution operation, which implies that the specific finite- 
difference operator has a spectral representation that can be compared with the 
now familiar exact ik operator.

We first look at this schematically. The least accurate approximation of a 
centred partial derivative is a two-point operator, as introduced in the previous 
chapter. The accuracy can be improved by increasing the length of the operator,
for example using Taylor series. On the other hand, in this chapter we have
learned that by using the Fourier concepts we can calculate basically an exact
derivative (to machine precision, provided we have frequencies below Nyquist).
Thus we can consider the two-point finite-difference scheme and the Fourier
method (n-point) as two ends of an axis describing the length of the difference
operator, n being the length of the vector describing the spatial domain. This is
illustrated in Fig. 5.15. It is important to note that from a practical/computational
point of view it makes sense to use either (1) a low-order spatial scheme, or (2) an
infinite-order (Fourier) scheme, whereas approaches in between are less optimal
(see exercises).

To demonstrate why the Fourier method can be interpreted as an infinite-order
finite-difference scheme let us recall one of the most fundamental (and useful)
mathematical results: the convolution theorem. Expressed in words, a multiplication
in the spectral domain is a convolution in the space domain and vice versa. In
mathematical terms, for two functions d(x) and f(x) with complex spectra D(k)
and F(k), the convolution theorem says that if

$$D(k) = \mathcal{F}[d], \qquad F(k) = \mathcal{F}[f], \tag{5.49}$$

then $d * f = \mathcal{F}^{-1}[D(k)\,F(k)]$, where $\mathcal{F}$ represents the Fourier
transform, and * denotes convolution, defined in the continuous case as

$$(d * f)(x) = \int_{-\infty}^{\infty} d(x')\,f(x - x')\,dx' \tag{5.50}$$

and in the discrete case with vectors $d_i$, i = 0, 1, ..., m, and $f_j$, j = 0, 1, ..., n,

$$(d * f)_k = \sum_{i=0}^{m} d_i\,f_{k-i}, \qquad k = 0, 1, \dots, m + n. \tag{5.51}$$


Because of the tremendous practical importance in seismology in particular of the
discrete convolution in connection with filtering, instrument correction, Green’s 
function analysis, and data processing, I strongly encourage the reader at some 
point to play around with the numerical implementation of these operations with 
software systems such as Python, Matlab, Mathematica, Maple, or others (see 
exercises). Note that the indexing in the previous equation is not unique. Other 
schemes lead to identical results. 
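
As a starting point for such experiments, the discrete convolution of Eq. 5.51 can be sketched in three equivalent ways: an explicit double loop, NumPy's `numpy.convolve`, and a multiplication of zero-padded spectra following the convolution theorem. The function names `discrete_convolution` and `fft_convolution` are hypothetical, and terms with indices outside the range of f are treated as zero.

```python
import numpy as np


def discrete_convolution(d, f):
    """Explicit evaluation of (d * f)_k = sum_i d_i f_{k-i}, k = 0, ..., m + n."""
    m, n = len(d) - 1, len(f) - 1
    out = np.zeros(m + n + 1)
    for k in range(m + n + 1):
        for i in range(m + 1):
            if 0 <= k - i <= n:  # f is taken as zero outside its index range
                out[k] += d[i] * f[k - i]
    return out


def fft_convolution(d, f):
    """Same result via the convolution theorem with zero padding."""
    size = len(d) + len(f) - 1
    spectrum = np.fft.rfft(d, size) * np.fft.rfft(f, size)
    return np.fft.irfft(spectrum, size)


if __name__ == "__main__":
    d = np.array([1.0, 2.0, 3.0])
    f = np.array([0.0, 1.0, 0.5, -1.0])
    print(discrete_convolution(d, f))
    print(np.convolve(d, f))
    print(fft_convolution(d, f))
```

The zero padding to length m + n + 1 is what turns the circular convolution implied by the discrete Fourier transform into the linear convolution of Eq. 5.51.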

Let us restate the previous result of the partial derivative as an inverse Fourier 
transform defined as 


$$\begin{aligned}
\partial_x f(x) &= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \partial_x\left[F(k)\,e^{ikx}\right] dk \\
&= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} ik\,F(k)\,e^{ikx}\,dk.
\end{aligned} \tag{5.52}$$

Defining the factors in front of the complex amplitude spectrum F(k) of function
f(x) as

$$\partial_x f(x) = \int_{-\infty}^{\infty} D(k)\,F(k)\,e^{ikx}\,dk, \qquad D(k) = ik, \tag{5.53}$$

we can now interpret this result in connection with the convolution theorem.
D(k) in general is nothing else but a function defined in the spectral domain acting
like a filter on the complex spectrum F(k). The convolution theorem implies that

$$\partial_x f(x) = \int_{-\infty}^{\infty} d(x - x')\,f(x')\,dx', \tag{5.54}$$

where d(x) is a real function, the spatial representation of spectrum D(k), in other
words

$$d(x) = \mathcal{F}^{-1}[D(k)]. \tag{5.55}$$

Are you curious to see what d(x) looks like? Luckily, this can be derived
analytically in a straightforward way. However, it is not realistic to work with
infinite integral boundaries. We thus limit the wavenumber domain to the Nyquist
wavenumber $k_{max} = \pi/dx$, knowing that we are operating on regularly spaced grids
with sampling interval dx (see Fig. 5.16). Thus D(k) becomes

$$D(k) = ik\,\left[H(k + k_{max}) - H(k - k_{max})\right], \tag{5.56}$$

where H() denotes the Heaviside function, and to obtain d(x) we simply have to
inverse transform

$$d(x) = \mathcal{F}^{-1}\left[ik\,\left[H(k + k_{max}) - H(k - k_{max})\right]\right], \tag{5.57}$$

leading to (see exercises)

$$d(x) = -\frac{1}{\pi x^2}\left[\sin(k_{max}x) - k_{max}x\,\cos(k_{max}x)\right] \tag{5.58}$$

in the space domain.

Fig. 5.15 Finite differences and Fourier method as two extreme cases. The
two-point finite-difference operator is the least accurate. Increasing the number
of points contributing to the derivative decreases the error. In practical
applications rarely more than 4-8 points are used. Using all points n along a
regularly spaced dimension allows the exact calculation of a derivative at the
cost of many more calculations.

Fig. 5.16 Band limitation of the derivative operator. The discretization with
increment dx requires the wavenumber domain to be restricted to the interval
$k \in [-k_{max}, k_{max}]$, where $k_{max} = \pi/dx$. Negative frequencies are of course
unphysical but originate in the formulation of the Fourier transform using
complex numbers.


It is again instructive to use the convolution theorem to understand this result. 
The right-hand side of Eq. 5.57 inside the Fourier integral is again a multiplica- 
tion of two spectra; one is the derivative operator ik, and the other is a boxcar
function expressed in terms of Heaviside functions. The inverse transform of a
boxcar is a well-known result frequently encountered in filter analysis. Without
formal demonstration we note that it is a sinc function of the form sin(x)/x. As
a consequence, the result of Eq. 5.57 is the derivative of a sinc function, and is
illustrated in Fig. 5.17. 

Up to this point this has applied to the continuous world; so let us see what 
happens if space is discretized according to 


$$x_n = n\,dx, \qquad n = -N, \dots, 0, \dots, N. \tag{5.59}$$

In this case the convolution integral becomes a convolution sum

$$\partial_x f(x) \approx \sum_{n=-N}^{N} d_n\,f(x - n\,dx), \tag{5.60}$$

where $d_n$ is the difference operator, in other words, the weight with which the
function value f(x - n dx) has to be multiplied before summing up to obtain the
derivative approximation. Note that, even though we derived this result using
Fourier transformations, this result is general and applies to any operator
half-length N ≥ 1, including the case of first-derivative staggered-grid calculations
of second order (N = 1) in which case the derivative would be defined halfway
between grid points.

If we insert the discretization into Eq. 5.58 we obtain analytically the dis- 
crete difference operator based on the Fourier transform, now expressed in the 
discretized space, a beautifully simple result: 


$$d_n = \begin{cases} 0 & \text{for } n = 0 \\ \dfrac{(-1)^n}{n\,dx} & \text{for } n \neq 0. \end{cases} \tag{5.61}$$


Let us have a closer look at the illustration of this operator in Fig. 5.17. The 
figure shows the analytical solution of the operator d(x) (solid line, the derivative 
of a sinc function) as a reference using a nominal space increment dx = 1. The 
discrete operator is indicated by + marks. 

First, note the anti-symmetry of the analytical (and discrete) result, with the
consequence that the information at the central point, where the derivative is
defined, is not influencing the derivative (one of the arguments for the use of
a staggered-grid approach). Second, note that the standard two-point
finite-difference operator (in this case [-1, 1]) is included (black dots) in the
Fourier operator (that, however, extends to 2N + 1 points in total). Third, the
Fourier derivative operator seems to decay slowly with distance from the central
point (in fact the decay is proportional to 1/n), indicating that a long operator
is required to obtain high accuracy (see Fichtner, 2010, for a more detailed
discussion of such operators also in the case of staggered grids and comparison
with finite-difference operators).

Fig. 5.17 Exact convolutional derivative operator. The band-limited discrete
central first derivative operator (+) is compared with the analytical
representation (solid line). The discrete operator d(x) is obtained by inverse
transforming the complex representation of the derivative operator D(k) = ik in
the Fourier domain, thus $d(x) = \mathcal{F}^{-1}[D(k)]$. The band limitation is a
consequence of the sampling theorem with a maximum wavenumber of
$k_{max} = \pi/dx$. The horizontal axis shows the grid point.

Fig. 5.18 Truncated Fourier operator. The original exact, discrete Fourier
operator $d_n$ is multiplied by a Gaussian and limited to N = 6 points on one
side. Original exact operator (solid line), truncated operator (+).

Fig. 5.19 Convolutional difference operators. A standard (Taylor-series-based)
four-point finite-difference operator (top) is compared with a truncated Fourier
operator obtained after multiplication with a Gauss function and limitation to
N = 5 points (middle), and the exact discrete Fourier operator (bottom).

This raises the question of whether it would be possible to define short- 
difference operators by cutting off the Fourier operator at some appropriate 
distance from the central point. A sensible strategy to do this seems to be to 
multiply this operator with a Gaussian centred at n = 0. Such a procedure is
illustrated in Fig. 5.18. This approach was pioneered for elastic wave problems 
(to my knowledge) in the Stanford group (Mora, 1986) and later extensively used 
for derivatives and interpolations (Igel et al., 1995) for anisotropic wave propaga- 
tion and waveform inversion problems. Several convolutional difference operators 
are compared in Fig. 5.19. 

How can we conveniently compare the accuracy of such operators? To do this
we turn around the procedure given above that provided us with the space
representation of the exact difference operator D(k) = ik in the wavenumber domain.
Nothing keeps us from simply Fourier transforming any difference operator into
the wavenumber domain, which will allow us to compare the result with the exact
D(k). Thus, for a finite-difference operator $d^{FD}$ we will obtain

$$D^{FD}(k) = i\,\tilde{k}(k) = \mathcal{F}[d^{FD}], \tag{5.62}$$

which, at least intuitively, we expect to be close to the exact solution for low
wavenumbers (meaning large numbers of grid points per wavelength). Let us
transform the operators illustrated in Fig. 5.19 to the wavenumber domain and
compare the imaginary part of the approximate operator spectra with the exact
solution (i.e., Imag[D(k)] = k). The results are shown in Fig. 5.20.

We can see that the short-finite-difference operator is accurate for large num- 
bers of grid points per wavelength, but the truncated Fourier operator already 
performs substantially better. 
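
The comparison in the wavenumber domain can be reproduced in a few lines. The sketch below is an illustration under stated assumptions: it uses the convention of Eq. 5.60, in which d_n multiplies f(x - n dx), a Gaussian taper exp(-a n²) with the arbitrary value a = 0.2, a half-length N = 6, and the standard fourth-order Taylor operator as the short finite-difference reference. None of these parameter values or function names are taken from the text.

```python
import numpy as np


def fourier_operator(N, dx=1.0):
    """Exact band-limited operator d_n = (-1)^n / (n dx), d_0 = 0 (Eq. 5.61)."""
    n = np.arange(-N, N + 1)
    d = np.zeros(2 * N + 1)
    nonzero = n != 0
    d[nonzero] = (-1.0) ** n[nonzero] / (n[nonzero] * dx)
    return n, d


def tapered_fourier_operator(N, a=0.2, dx=1.0):
    """Fourier operator multiplied by a Gaussian centred at n = 0 (a is assumed)."""
    n, d = fourier_operator(N, dx)
    return n, d * np.exp(-a * n**2)


def taylor_operator(dx=1.0):
    """Fourth-order Taylor operator; weights multiply f(x - n dx)."""
    n = np.arange(-2, 3)
    d = np.array([-1.0, 8.0, 0.0, -8.0, 1.0]) / (12.0 * dx)
    return n, d


def numerical_wavenumber(n, d, k, dx=1.0):
    """Apply the operator to exp(ikx): sum_n d_n exp(-i k n dx) = i * k_num."""
    return np.imag(np.sum(d * np.exp(-1j * k * n * dx)))


if __name__ == "__main__":
    for points_per_wavelength in (16, 8, 4, 3):
        k = 2.0 * np.pi / points_per_wavelength
        k_taylor = numerical_wavenumber(*taylor_operator(), k)
        k_taper = numerical_wavenumber(*tapered_fourier_operator(6), k)
        print(points_per_wavelength, k, k_taylor, k_taper)
```

Printing the numerical wavenumbers next to the exact k shows how the two short operators depart from the exact result as the number of grid points per wavelength decreases; the outcome depends on the taper width, which is a free design parameter here.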

I find this alternative description in the spectral domain a very powerful con- 
cept to classify explicit numerical approximations to derivatives with the added 
value of providing a means to generate very accurate short-difference operators. 
The curse of the Fourier method is its requirement of periodicity, which is very 
rarely what we need in Earth sciences. How this can be fixed is the topic of the 
next few sections. 


5.6 The Chebyshev pseudospectral method 


Historically, the substantial accuracy improvement of the space derivative 
calculations using the Fourier transform was a major step forward at a time when 
serial (vectorized) hardware architecture was the standard. The major drawback 
was the requirement that the space-dependent functions have periodic behaviour. 
In most cases in Earth sciences we are dealing with limited-area calculations (e.g. 
a reservoir, a volcano, a piece of the Earth’s crust), which requires the accurate 
implementation of free-surface or absorbing boundary conditions that are very
hard to achieve when using trigonometric basis functions. 

For those reasons computational seismologists looked for alternative function 
approximations with similar convergence properties to Fourier series. Chebyshev¹⁰
polynomials were the natural choice as the preferred solution for non-periodic
problems with spectral convergence. In the first applications, Chebyshev
polynomials were used to describe the space-dependent functions in the whole
physical domain. The required irregular spatial grid creates other problems,
however, reducing the maximum time step allowed for large models. On the other
hand, a major advantage is the possibility of accurate implementation of boundary
conditions.

There is another reason why Chebyshev polynomials are important for 
the further evolution of computational seismology. Their superior interpola- 
tion properties in combination with an accurate integration scheme led to the 
first Chebyshev-based spectral finite-element implementation, albeit with non- 
diagonal mass matrix formulation. This is why the concept of limited-area 
function interpolation is a key ingredient to some of the methods we use today. 


5.6.1 Chebyshev polynomials 


To introduce Chebyshev polynomials, let us start with the trigonometric relation

$$\cos[(n+1)\phi] + \cos[(n-1)\phi] = 2\cos(\phi)\cos(n\phi), \qquad n \in \mathbb{N}. \tag{5.63}$$

Inserting n = 0 leads to a trivial statement. However, for n ≥ 1 we obtain
statements like

$$\begin{aligned}
\cos(2\phi) &= 2\cos^2(\phi) - 1 \\
\cos(3\phi) &= 4\cos^3(\phi) - 3\cos(\phi) \\
\cos(4\phi) &= 8\cos^4(\phi) - 8\cos^2(\phi) + 1
\end{aligned} \tag{5.64}$$

indicating that we can express cos(nφ) in terms of polynomials in cos(φ). This in
fact leads to the definition of Chebyshev polynomials with

$$\cos(n\phi) := T_n(\cos(\phi)) = T_n(x) \tag{5.65}$$

with

$$x = \cos(\phi), \qquad x \in [-1, 1], \qquad n \in \mathbb{N}_0, \tag{5.66}$$

$T_n$ being the nth-order Chebyshev polynomial. The important step here is the
mapping x = cos(φ), which limits the definition of these polynomials to the
interval [-1, 1]. Furthermore

$$|T_n(x)| \le 1 \quad \text{for } x \in [-1, 1], \qquad n \in \mathbb{N}_0. \tag{5.67}$$

Finally, we can write down the first polynomials in x ∈ [-1, 1]:

$$\begin{aligned}
T_0(x) &= 1 \\
T_1(x) &= x \\
T_2(x) &= 2x^2 - 1 \\
T_3(x) &= 4x^3 - 3x \\
T_4(x) &= 8x^4 - 8x^2 + 1
\end{aligned} \tag{5.68}$$

and an illustration of some polynomials is given in Fig. 5.21. There is a recursive
relation that can be conveniently used to calculate the Chebyshev polynomials of
any order n:

$$T_{n+1}(x) = 2x\,T_n(x) - T_{n-1}(x), \qquad n \ge 1. \tag{5.69}$$

Fig. 5.20 Difference operators compared in the wavenumber domain. The spectra
of the truncated difference operators are compared with the exact solution. In
principle the vertical axis corresponds to the numerical wavenumber $\tilde{k}$ as a
function of the exact wavenumber k.

Fig. 5.21 Chebyshev polynomials $T_n(x)$ in the interval [0, 1] for n = 2, ..., 8.
The horizontal axis is x = cos(φ).

¹⁰ Born as Pafnuti Lwowitsch Tschebyschow (1821-1894). His surname would be
spelled in many different ways. He is considered one of the greatest
mathematicians of the nineteenth century. He worked in St. Petersburg. His pupils
included Marcov, Voronoi, and Ljapunov.
The extremal values $x_k^{(ext)}$ of these polynomials have a very simple form

$$x_k^{(ext)} = \cos\left(\frac{k\pi}{n}\right), \qquad k = 0, 1, 2, \dots, n, \tag{5.70}$$


and they are shown in Fig. 5.22 for varying n. In fact, these points, and in particu- 
lar this form of irregular grid with densification at the edges, will play an important 
role in the story that unfolds. We anticipate that these extremal points will be 
the Chebyshev collocation points at which the polynomials exactly interpolate an 
arbitrary discrete function. 
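
The recursion of Eq. 5.69 and the extremal points of Eq. 5.70 are easy to explore numerically. The following sketch is an illustration only; the function names `chebyshev_T`, `chebyshev_extrema`, and `spacing_ratio` are hypothetical.

```python
import numpy as np


def chebyshev_T(n, x):
    """Chebyshev polynomial T_n(x) via T_{n+1} = 2 x T_n - T_{n-1}."""
    x = np.asarray(x, dtype=float)
    t_prev = np.ones_like(x)  # T_0
    if n == 0:
        return t_prev
    t_curr = x.copy()         # T_1
    for _ in range(1, n):
        t_prev, t_curr = t_curr, 2.0 * x * t_curr - t_prev
    return t_curr


def chebyshev_extrema(n):
    """Extremal points x_k = cos(k pi / n), k = 0, ..., n (run from +1 to -1)."""
    return np.cos(np.pi * np.arange(n + 1) / n)


def spacing_ratio(n):
    """Ratio between largest and smallest distance of neighbouring extrema."""
    spacing = np.abs(np.diff(chebyshev_extrema(n)))
    return spacing.max() / spacing.min()


if __name__ == "__main__":
    x = np.linspace(-1.0, 1.0, 5)
    print(chebyshev_T(4, x))
    print(np.cos(4.0 * np.arccos(x)))  # definition via cos(n * phi)
    for n in (4, 8, 16, 32):
        print(n, spacing_ratio(n))
```

The spacing ratio grows with the order n, which quantifies the densification of points near the boundaries.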

With this in mind, and assuming that we have a spatial domain discretized with
these grid points, the linear function shown in Fig. 5.22 (right) has a strong
message. The fact that the ratio between the largest and smallest grid-point
distance increases with order implies that, according to the Courant criterion
that we cannot escape from, the required time step will become very small for
increasing order, while any spatial field is sampled coarsely in the centre of the
domain. This behaviour will also be relevant later for spectral element methods.

It can be shown that the Chebyshev polynomials form an orthogonal set with
respect to the weighting function $w(x) = 1/\sqrt{1 - x^2}$. This implies that we can
use them as a basis for function interpolation. By analogy with the discussion
of Fourier series, we pose the problem of finding an approximation $g_n(x)$ to an
arbitrary function f(x) defined in the interval [-1, 1] (a condition that can be easily
relaxed). In mathematical terms

$$f(x) \approx g_n(x) = \frac{1}{2}\,c_0\,T_0(x) + \sum_{k=1}^{n} c_k\,T_k(x), \tag{5.71}$$

where $T_k(x)$ are the Chebyshev polynomials and $c_k$ are real coefficients. The
coefficients $c_k$ can be found in the same way as the Fourier coefficients by
minimizing the least-squares misfit between original function f(x) and
approximation $g_n(x)$ to obtain

$$c_k = \frac{2}{\pi} \int_{-1}^{1} f(x)\,T_k(x)\,\frac{dx}{\sqrt{1 - x^2}}, \qquad k = 0, 1, \dots, n, \tag{5.72}$$

which, after substituting x = cos(φ), can be written as

$$c_k = \frac{1}{\pi} \int_{-\pi}^{\pi} f(\cos\phi)\,\cos(k\phi)\,d\phi, \qquad k = 0, 1, \dots, n. \tag{5.73}$$

These coefficients turn out to be the Fourier coefficients for the even 2π-periodic
function f(cos(φ)) with x = cos(φ).¹¹

Is there a set of points on which Chebyshev polynomials interpolate exactly as
was the case with the discrete Fourier transform? Well, as stated before, the answer
is yes, and we thus can write down a corresponding discrete Chebyshev transform
with similar properties to the Fourier transform but without the requirement of
periodicity. The points we need are the extrema of the Chebyshev polynomials,
the (Chebyshev-) Gauss-Lobatto points defined as¹²

$$x_i = \cos\left(\frac{\pi}{m}\,i\right), \qquad i = 0, 1, \dots, m. \tag{5.74}$$

With these unevenly distributed grid points we can define the discrete Chebyshev
transform as follows. The approximating function is

$$g_m^*(x) = \frac{1}{2}\,c_0^*\,T_0 + \sum_{k=1}^{m-1} c_k^*\,T_k(x) + \frac{1}{2}\,c_m^*\,T_m \tag{5.75}$$

with the coefficients defined as

$$c_k^* = \frac{2}{m}\left[\frac{1}{2}\left(f(1) + (-1)^k f(-1)\right) + \sum_{j=1}^{m-1} f_j \cos\left(\frac{k\pi j}{m}\right)\right], \qquad k = 0, 1, 2, \dots, n, \quad n = m. \tag{5.76}$$

Here, f(1) and f(-1) are the function values at the interval boundaries and $f_j$ are
the values at the collocation points, $f(x_j = \cos(j\pi/m))$. With these definitions we
recover the fundamental property

$$g_m^*(x_i) = f(x_i), \tag{5.77}$$

where $x_i$ are the collocation points with the implication that we exactly (to
machine precision) recover the original function. These equations should be
compared with those of the discrete Fourier transform (Eqs. 5.16-5.20) as an
equivalent formulation for limited functions.

Fig. 5.22 Extrema of the (even-order) Chebyshev polynomials (collocation
points) for N = 4, ..., 50. Left: Location of the extrema with increasing
order. Note the densification of points near the boundaries [-1, 1] and the
increasing difference between largest and smallest separation. Right: Ratio
between largest and smallest grid point distance as a function of order.

¹¹ This has practical consequences. It implies that the fast Fourier transform can
be used to calculate the coefficients (and derivatives). However, for the purpose of
simplicity we stick to the approach of using derivative matrices for our numerical
examples.

¹² Careful! In this notation the x coordinate starts at 1 and ends at -1. Sometimes
this definition appears with a minus sign but this affects the definitions that follow.
Let us again see how this works in practice and approximate a simple function
like f(x) = x³ in the interval [-1, 1] using the Chebyshev transform. The results
are shown in Fig. 5.23 (left), indicating that a non-periodic function like f(x):
(1) can be exactly interpolated at the collocation points, and (2) converges very 
rapidly, with just a few polynomials. This raises the question of how the Cheby- 
shev transform deals with discontinuous behaviour inside the domain. This is 
illustrated in the second example of Fig. 5.23 (right). We note that, despite the 
exact interpolation, convergence of the interpolating function is slow and an over- 
shoot occurs, just as with the Fourier method. The consequence is that similar 
smoothness constraints have to be imposed when using these techniques for the 
solution of wave equations. 

Finally, it is instructive again to demonstrate the interpolation behaviour 
showing the corresponding cardinal functions of Chebyshev polynomials. They 
are obtained by finding the interpolating function for spikes of the form fj; = 
(0,1,...,0) at any point on the grid. Two examples are shown in Fig. 5.24. In 


Fig. 5.23 Discrete interpolation with Chebyshev polynomials. Left: Interpolation
of the function $f(x) = x^3$ in the interval [-1, 1] on the Chebyshev collocation
points. Right: Interpolation of a Heaviside function with discrete Chebyshev
polynomials. Note the occurrence of an overshoot as we observed with Fourier
series (the Gibbs phenomenon).

Fig. 5.24 Cardinal functions with Chebyshev polynomials. Two examples of cardinal
functions for grid points i = 2 (dashed line) and i = 6 (solid line) are shown
for n = 8.

13. It is worth noting that the cardinal functions may exceed 1 in this case.
Later we will use Lagrange polynomials (spectral-element method) that do not
exceed 1 in the corresponding interval. This has advantages (see Fichtner, 2010
for an in-depth discussion).


the case of the Fourier method they were of identical form and only translated for
each grid point. Here they are different as a result of the uneven spacing.¹³
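A cardinal function can be computed directly as the polynomial that interpolates a unit spike. The sketch below assumes the Lagrange product form on the Chebyshev collocation points, which yields the same polynomial as interpolating the spike because the interpolant through n + 1 points is unique. The function name `cardinal_function` is hypothetical, and this is not necessarily how Fig. 5.24 was produced.

```python
import numpy as np

def cardinal_function(x_nodes, j, x_eval):
    """Polynomial that is 1 at node j and 0 at all other nodes, evaluated at x_eval."""
    ell = np.ones_like(x_eval, dtype=float)
    for m, xm in enumerate(x_nodes):
        if m != j:
            ell *= (x_eval - xm) / (x_nodes[j] - xm)
    return ell

n = 8
x_nodes = np.cos(np.pi * np.arange(n + 1) / n)   # Chebyshev collocation points
x_fine = np.linspace(-1.0, 1.0, 401)

card_2 = cardinal_function(x_nodes, 2, x_fine)   # grid point i = 2
card_6 = cardinal_function(x_nodes, 6, x_fine)   # grid point i = 6

# Extreme values on the fine grid, for inspection
print(card_2.min(), card_2.max())
```

Plotting `card_2` and `card_6` against `x_fine` shows that the shape depends on the grid point, in contrast to the translated copies obtained with the Fourier method.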

For the numerical solution of the wave equation we need derivatives. How to 
obtain derivatives with the Chebyshev transform is discussed in the next section. 


5.6.2 Chebyshev derivatives, differentiation matrices


It is well known that a convolution operation as introduced in the section on
Fourier derivatives can be formulated as a matrix-vector product involving
so-called Toeplitz matrices. An elegant (but inefficient) way of performing a
derivative operation on a space-dependent function described on the Chebyshev
collocation points is by defining a derivative matrix $D_{ij}$:

$$
D_{ij} =
\begin{cases}
-\dfrac{2N^2 + 1}{6} & \text{for } i = j = N \\[2mm]
-\dfrac{1}{2}\,\dfrac{x_i}{1 - x_i^2} & \text{for } i = j = 1, 2, \ldots, N-1 \\[2mm]
\dfrac{c_i}{c_j}\,\dfrac{(-1)^{i+j}}{x_i - x_j} & \text{for } i \neq j = 0, 1, \ldots, N
\end{cases}
\tag{5.78}
$$

where $N + 1$ is the number of Chebyshev collocation points $x_i = \cos(i\pi/N)$,
$i = 0, \ldots, N$, $D_{00} = -D_{NN}$, and the $c_i$ are given as

$$
c_i =
\begin{cases}
2 & \text{for } i = 0 \text{ or } N \\
1 & \text{otherwise.}
\end{cases}
\tag{5.79}
$$


This differentiation matrix allows us to write the derivative of the function
$u_i = u(x_i)$ (possibly depending on time) simply as

$$ \partial_x u_i = D_{ij}\, u_j, \tag{5.80} $$

where the right-hand side is a matrix-vector product, and the Einstein summation
convention applies. Following the lengthy foreword about Chebyshev polynomials,
we finally arrive at an exact (polynomial-based) derivative at the Chebyshev
collocation points.
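Eqs. 5.78 and 5.79 translate almost literally into code. The following sketch uses explicit loops for readability rather than speed; the function name `chebyshev_diff_matrix` is an assumption for illustration.

```python
import numpy as np

def chebyshev_diff_matrix(n):
    """Return the (n+1) x (n+1) differentiation matrix of Eq. 5.78 and the grid."""
    x = np.cos(np.pi * np.arange(n + 1) / n)   # collocation points
    c = np.ones(n + 1)                          # weights of Eq. 5.79
    c[0] = c[n] = 2.0
    D = np.zeros((n + 1, n + 1))
    # Off-diagonal entries
    for i in range(n + 1):
        for j in range(n + 1):
            if i != j:
                D[i, j] = c[i] / c[j] * (-1.0) ** (i + j) / (x[i] - x[j])
    # Interior diagonal entries
    for i in range(1, n):
        D[i, i] = -0.5 * x[i] / (1.0 - x[i] ** 2)
    # Corner entries, D_00 = -D_NN
    D[n, n] = -(2.0 * n**2 + 1.0) / 6.0
    D[0, 0] = -D[n, n]
    return D, x

D, x = chebyshev_diff_matrix(16)
# Derivative of a low-order polynomial should be reproduced to round-off level
print(np.max(np.abs(D @ x**2 - 2.0 * x)))
```

For n = 1 the function returns the matrix [[0.5, -0.5], [0.5, -0.5]], i.e. a two-point difference on the grid (1, -1).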

Before we proceed to demonstrate its behaviour, let us have a closer look at the
concept of differentiation matrices. In the preceding sections we demonstrated
that any finite-difference-type calculation can be expressed as a discrete
convolution. In that respect, any convolution of two vectors can be expressed as a
matrix-vector product. This is well known and frequently used in filter theory.
Finding these matrices involves the concept of Toeplitz matrices. Thus, if we can
write down the convolutional operator for a differentiation we can also determine
a differentiation matrix. In practice this is rarely used in computer programs as the
operation scales with $n^2$, with n being the length of the original vector. However,
again this is a powerful concept for comparing the structure and properties of the
various approaches.
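The equivalence between a discrete convolution and a matrix-vector product can be sketched as follows. The example assumes the centred finite-difference stencil with weights (1, 0, -1)/(2 dx) and compares the Toeplitz matrix product with `np.convolve`; the helper name `convolution_matrix` is hypothetical.

```python
import numpy as np

def convolution_matrix(kernel, n):
    """Toeplitz matrix T with T @ v == np.convolve(v, kernel, mode='same').

    Assumes an odd-length kernel and n >= len(kernel).
    """
    m = len(kernel)
    half = m // 2
    T = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            k = i - j + half          # entry depends on i - j only (Toeplitz)
            if 0 <= k < m:
                T[i, j] = kernel[k]
    return T

n = 50
x = np.linspace(0.0, 2.0 * np.pi, n)
dx = x[1] - x[0]
kernel = np.array([1.0, 0.0, -1.0]) / (2.0 * dx)   # centred difference as a convolution

v = np.sin(x)
T = convolution_matrix(kernel, n)
dv_matrix = T @ v
dv_conv = np.convolve(v, kernel, mode="same")

print(np.max(np.abs(dv_matrix - dv_conv)))
```

The matrix `T` is banded, which is the structure shown for the finite-difference operators in Fig. 5.25; the first and last rows are one-sided and therefore not valid derivatives.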

In Figure 5.25 a number of differentiation matrices are shown, including the 
Chebyshev case. Both exact cases (Chebyshev and Fourier) consist of full ma- 
trices. The finite-difference operators are banded matrices with the band-width 
equal to the length of the operator. It is noteworthy that, for the same number 
of points, the Chebyshev weights increase drastically towards the edges. This 
is related to the decreasing grid point distances at the interval boundaries. An- 
other interesting point is that the Chebyshev differentiation matrix consists of the 
standard two-point operator for N = 1. 

Let us proceed by testing the differentiation using Chebyshev polynomials. We
define a function akin to seismic wavefield calculations as

$$ f(x_i) = \sin(2x_i) - \sin(3x_i) + \sin(4x_i) - \sin(10x_i) \tag{5.81} $$

in the interval $x_i \in [-1, 1]$, where the discrete points are the Chebyshev
collocation points $x_i = \cos(\pi i/n)$, $i = 0, \ldots, n$. Following the approach shown earlier,
the derivative of the function $f(x_i)$ is simply obtained by applying the
matrix-vector product presented in Eq. 5.80 to the vector containing the function values.
The results for n = 63 (which implies n + 1 points including the boundaries) are
shown in Fig. 5.26. We expect a (close-to) exact derivative, and this is what we
obtain if we look at the error of the derivative (compared to the analytical
solution). Numerically the error behaves slightly differently from the Fourier method,


Fig. 5.25 Illustration of differentiation matrices (n = 64). Top left: Exact
Fourier differentiation matrix for regular grid (full). Top right: Tapered Fourier
operator (12-point). Matrix is banded. For illustration purposes the square root
of the absolute values is shown. Bottom left: Standard 2-point finite-difference
operator (banded). Bottom right: Exact Chebyshev differentiation matrix for
Chebyshev collocation points. Note that the matrix is full. Increasing weights
at the corners dwarf interior values.
Fig. 5.26 Numerical derivative using the Chebyshev differentiation matrix.
Top: Discrete analytical function (see text) defined on the Chebyshev collocation
points (n = 63). Bottom: Superposition of exact derivative and numerical
derivative based on matrix-vector multiplication (indistinguishable solid lines).
The difference between numerical derivative and analytical solution is given by
the dotted line after multiplication by $10^{11}$.


which is related to the irregular sampling intervals. The round-off error increases at the
boundaries and also depends on the overall number of grid points used in the
entire interval.
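The test described above can be reproduced with a few lines. This is a sketch under stated assumptions: the helper `cheb` is hypothetical, and it fills the diagonal of the differentiation matrix from the condition that each row sums to zero, a common variant that gives the same matrix as Eq. 5.78 analytically and tends to reduce round-off.

```python
import numpy as np

def cheb(n):
    """Chebyshev differentiation matrix and collocation points (n + 1 points)."""
    x = np.cos(np.pi * np.arange(n + 1) / n)
    c = np.ones(n + 1)
    c[0] = c[n] = 2.0
    c = c * (-1.0) ** np.arange(n + 1)
    X = np.tile(x, (n + 1, 1)).T             # X[i, j] = x[i]
    dX = X - X.T                             # x_i - x_j
    D = np.outer(c, 1.0 / c) / (dX + np.eye(n + 1))
    D = D - np.diag(D.sum(axis=1))           # diagonal from zero row sums
    return D, x

n = 63
D, x = cheb(n)

# Function of Eq. 5.81 and its analytical derivative
f = np.sin(2 * x) - np.sin(3 * x) + np.sin(4 * x) - np.sin(10 * x)
df_exact = 2 * np.cos(2 * x) - 3 * np.cos(3 * x) + 4 * np.cos(4 * x) - 10 * np.cos(10 * x)

df_num = D @ f                               # Eq. 5.80
error = df_num - df_exact
print("max error:", np.max(np.abs(error)))
print("error at boundary vs centre:", abs(error[0]), abs(error[n // 2]))
```

Increasing `n` and printing the error at the boundary and at the centre is a simple way to explore how round-off depends on position and on the number of grid points.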

We can now assemble these tools to come up with another solution to our 1D 
elastic wave equation, using Chebyshev polynomials. 


5.6.3 Elastic waves in 1D 


Let us state the elastic 1D wave equation for the unknown discrete displacement
field $u_i^j$, right away as an extrapolation problem of the form

$$ \rho_i\, \frac{u_i^{j+1} - 2u_i^{j} + u_i^{j-1}}{dt^2} = \left( \partial_x \left[ \mu(x)\, \partial_x u(x,t) \right] \right)_i^j + f_i^j \tag{5.82} $$

using the standard three-point operator for the time-dependent part. Again the
lower index i corresponds to the spatial discretization and the upper index j
to the discrete time levels. The main difference with any previous methods is
that the displacement field, and the geophysical parameters like density $\rho_i$ and
shear modulus $\mu_i$, are defined on the irregular Chebyshev collocation points. The
parameters used in our example simulation are given in Table 5.4.

The fact that we use the Chebyshev collocation points with densification near
the interval boundaries has (dramatic) consequences for the simulation set-up.
As we can see from the parameters, the distance between the grid points is 80
times smaller at the boundaries compared to the centre of the physical domain.
The time step for a stable simulation, according to the Courant criterion, requires
$c\,dt/dx < \epsilon$, where c is the maximum velocity, dt is the time step, dx is the
minimum(!) grid interval, and $\epsilon$ is some value close to 1. That means that the grid
distance near the boundary is responsible for the global simulation time step.
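The consequence of the Courant criterion can be made concrete with a short calculation. The sketch below assumes n = 200 intervals on [-1, 1], a velocity of 3,000 m/s and a Courant number of 1.4, chosen to resemble the example simulation; the variable names are illustrative only.

```python
import numpy as np

n = 200          # number of grid intervals (assumed)
c = 3000.0       # maximum velocity in m/s (assumed)
eps = 1.4        # Courant number (assumed)

x = np.cos(np.pi * np.arange(n + 1) / n)   # Chebyshev collocation points
dx = np.abs(np.diff(x))                     # grid intervals (x is decreasing)

dx_min = dx.min()                           # found next to the boundaries
dx_max = dx.max()                           # found at the centre

# Time step dictated by the smallest grid interval
dt = eps * dx_min / c
# Time step that the central grid spacing alone would allow
dt_centre = eps * dx_max / c

print("dx_min =", dx_min, "m, dx_max =", dx_max, "m")
print("dt =", dt, "s, dt allowed at the centre =", dt_centre, "s")
```

Comparing `dt` with `dt_centre` quantifies how strongly the boundary grid spacing penalizes the global time step.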

In our example we stick to the interval boundaries $x_i \in [-1, 1]$ and thus the
simulation would correspond to a 2 m block with wave propagation in the kHz
range. The results of the simulation are shown in Fig. 5.27. The figure shows two
snapshots of the propagating phase (the first derivative of a Gaussian) near the
centre of the domain (sampled with ≈ 10 points) and near the edge (sampled with
≈ 35 points). In the graphical representation the discrete displacement values are
interpolated using the Chebyshev transform for illustration purposes.

The following code snippet shows the implementation of the Chebyshev
method with Python. We assume that the differentiation matrix D has been
initialized and current values of the displacement field are stored in vector u. The
derivatives are obtained by the implicit matrix-vector multiplication requiring the
vectors to be transposed (e.g. u.T). The boundary points at the edges [-1, 1] are
set to 0 at each time step.


# Time extrapolation
for it in range(nt):
    # Space derivatives
    du = D @ u.T
    du = mu / rho * du
    du = D @ du
    # Extrapolation
    unew = 2 * u - uold + du.T * dt**2
    # Source injection
    unew = unew + gauss / rho * src[it] * dt**2
    # Remap displacements
    uold, u = u, unew


Looking at our simulation example we can identify a major disadvantage of
the Chebyshev method. While we benefit from the fact that we are working in a
limited domain, the price is high. To obtain a stable solution we need a very small
time step, which is basically only needed at the boundaries. In fact, mathematically
the time step scales with $O(N^{-2})$. This implies that inside the domain we hugely
oversample the wavefield in time.
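The scaling follows from the smallest grid interval, $1 - \cos(\pi/N) \approx \pi^2/(2N^2)$, which through the Courant criterion carries over to the time step. A minimal numerical check, with the hypothetical helper `min_spacing`:

```python
import numpy as np

def min_spacing(n):
    """Smallest interval between Chebyshev collocation points x_i = cos(pi*i/n)."""
    x = np.cos(np.pi * np.arange(n + 1) / n)
    return np.min(np.abs(np.diff(x)))

# If the minimum spacing scales like N**-2, then spacing * N**2 is roughly constant
for n in (32, 64, 128, 256):
    print(n, min_spacing(n), min_spacing(n) * n**2)
```

Doubling the number of grid points therefore reduces the admissible time step by about a factor of four, compared with a factor of two on a regular grid.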

However, there is (at least to some extent) a fix to that problem. In principle we
can (re-) stretch the spatial grid such that the grid points close to the boundaries
are further apart while the grid point distances at the centre remain basically
unchanged. If that stretching function is $\xi(x)$ then the derivative of a function
f(x) on the stretched grid is defined as

$$ \partial_x f(x) = \frac{\partial f}{\partial \xi}\, \frac{d\xi}{dx}. \tag{5.83} $$

This is a trivial additional operation if the coordinate change is an analytical
function, which is the case. The procedure is explained for example in Carcione and
Wang (1993) and is an essential ingredient in all Chebyshev methods that are
used for scientific applications.
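The chain rule of Eq. 5.83 can be sketched as follows. The arcsine-type mapping $x = \arcsin(\alpha\xi)/\arcsin(\alpha)$ and the value $\alpha = 0.9$ are assumptions chosen purely for illustration (they are not necessarily the stretching used in the reference above), and the helper `cheb` is hypothetical. Here $\xi$ denotes the Chebyshev collocation points and x the stretched physical grid.

```python
import numpy as np

def cheb(n):
    """Chebyshev differentiation matrix (with respect to xi) and collocation points."""
    xi = np.cos(np.pi * np.arange(n + 1) / n)
    c = np.ones(n + 1)
    c[0] = c[n] = 2.0
    c = c * (-1.0) ** np.arange(n + 1)
    X = np.tile(xi, (n + 1, 1)).T
    D = np.outer(c, 1.0 / c) / (X - X.T + np.eye(n + 1))
    D = D - np.diag(D.sum(axis=1))
    return D, xi

n = 64
alpha = 0.9                                   # stretching parameter (assumed)
D, xi = cheb(n)

# Stretched physical grid and the analytical factor d(xi)/dx
x = np.arcsin(alpha * xi) / np.arcsin(alpha)
dxi_dx = np.arcsin(alpha) * np.sqrt(1.0 - (alpha * xi) ** 2) / alpha

# Derivative on the stretched grid via the chain rule
f = np.sin(2.0 * x)
df_num = dxi_dx * (D @ f)
err = np.max(np.abs(df_num - 2.0 * np.cos(2.0 * x)))

dx_min_cheb = np.min(np.abs(np.diff(xi)))
dx_min_stretched = np.min(np.abs(np.diff(x)))
print("error:", err)
print("minimum spacing, original vs stretched:", dx_min_cheb, dx_min_stretched)
```

The larger minimum spacing on the stretched grid translates directly into a larger stable time step, at the cost of one extra multiplication per derivative.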

This concludes the presentation of the pseudospectral method based on 
limited-area calculations using Chebyshev polynomials. Spectral interpolation 
properties in limited areas—possible also with Lagrange polynomials—will reap- 
pear when we discuss the spectral element method. 


5.7 The road to 3D 


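The extension of pseudospectral methods to 3D is straightforward. Consider the
acoustic equation using the Fourier method with

$$ \partial_t^2 p = c^2 \left( \partial_x^2 p + \partial_y^2 p + \partial_z^2 p \right) + s \tag{5.84} $$

and

$$ \partial_x^2 p_{jkl}^{n} = \mathscr{F}^{-1}\left[ -k_x^2\, P^{n} \right]. \tag{5.85} $$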
Table 5.4 Simulation parameters for 1D elastic simulation with the Chebyshev method

Parameter      Value
$n_x$          200
$c$            3,000 m/s
$\rho$         2,500 kg/m³
$dt$           6 × 10⁻⁸ s
$dx_{min}$     1.2 × 10⁻⁴ m
$dx_{max}$     0.015 m
$f_0$          100 kHz
$\epsilon$     1.4
Fig. 5.27 Simulation of the 1D elastic wave equation with the Chebyshev
method: snapshots of the propagating phase. The discrete values of the numerical
simulation are indicated by dots. The solid line is the (exact) interpolation
function obtained by the Chebyshev transform. Top: Wavefield near the centre of
the domain. Bottom: Wavefield near the boundary. Note the drastic difference of
the wavefield sampling in both cases.


Here, $P^n$ corresponds to the Fourier transform of $p^n$ with respect to x. Derivatives
with respect to other coordinates can be calculated accordingly. Furthermore, j, k, l
denote space increments and n the time level. With $p_j \rightarrow p_{jkl}$, we simply add two
more dimensions to the space-dependent fields and calculate the derivatives using
the same routines as in 1D. The time extrapolation remains unchanged.
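A sketch of how the same 1D Fourier derivative can be reused along each axis of a 3D array is given below. It assumes a periodic grid and uses NumPy's FFT routines; the function name `fourier_second_derivative` and the test field are illustrative assumptions.

```python
import numpy as np

def fourier_second_derivative(p, dx, axis):
    """Second derivative of p along one axis: inverse FFT of -k**2 times the FFT."""
    n = p.shape[axis]
    k = 2.0 * np.pi * np.fft.fftfreq(n, d=dx)     # wavenumbers along this axis
    shape = [1] * p.ndim
    shape[axis] = n
    k = k.reshape(shape)                           # broadcast along the other axes
    P = np.fft.fft(p, axis=axis)
    return np.real(np.fft.ifft(-(k**2) * P, axis=axis))

# Periodic test field p = sin(x) sin(2y) sin(3z) on [0, 2*pi)^3
n = 32
dx = 2.0 * np.pi / n
g = dx * np.arange(n)
p = (np.sin(g)[:, None, None]
     * np.sin(2.0 * g)[None, :, None]
     * np.sin(3.0 * g)[None, None, :])

# Sum of second derivatives along x, y and z
laplacian = sum(fourier_second_derivative(p, dx, axis) for axis in range(3))

# Analytically the Laplacian of this field is -(1 + 4 + 9) p
print(np.max(np.abs(laplacian + 14.0 * p)))
```

Multiplying `laplacian` by the squared velocity and adding the source term would give the right-hand side of Eq. 5.84 for the time extrapolation.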

The same applies to the Chebyshev method in matrix form, where Eq. 5.85
would be replaced by

$$ \partial_x^2 p_{jkl}^{n} = D_{ji}\, p_{ikl}^{n}, \tag{5.86} $$

where $D_{ji}$ is a differentiation matrix for the second derivative and the summation
convention applies.
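The summation over one index of a 3D field maps naturally onto `np.einsum`. The following sketch assumes that the second-derivative matrix is obtained by squaring a first-derivative Chebyshev matrix, which is one possible choice; the helper `cheb` and the test field are hypothetical.

```python
import numpy as np

def cheb(n):
    """Chebyshev first-derivative matrix and collocation points (n + 1 points)."""
    x = np.cos(np.pi * np.arange(n + 1) / n)
    c = np.ones(n + 1)
    c[0] = c[n] = 2.0
    c = c * (-1.0) ** np.arange(n + 1)
    X = np.tile(x, (n + 1, 1)).T
    D = np.outer(c, 1.0 / c) / (X - X.T + np.eye(n + 1))
    D = D - np.diag(D.sum(axis=1))
    return D, x

n = 16
D, x = cheb(n)
D2 = D @ D                                   # second-derivative matrix (assumed construction)

# Test field p(x, y, z) = x**3 * y * z on a small tensor grid
y = np.linspace(-1.0, 1.0, 4)
z = np.linspace(-1.0, 1.0, 5)
p = x[:, None, None] ** 3 * y[None, :, None] * z[None, None, :]

# Second derivative with respect to x: sum over the first index only
d2p_dx2 = np.einsum("ji,ikl->jkl", D2, p)

expected = 6.0 * x[:, None, None] * y[None, :, None] * z[None, None, :]
print(np.max(np.abs(d2p_dx2 - expected)))
```

Derivatives along the other directions follow by moving the summed index, for example `"ki,jil->jkl"` for the second axis when that axis is also discretized on Chebyshev points.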

However, note that pure 3D pseudospectral schemes only make sense in a
serial computing environment, because of the global communication schemes.
Furumura et al. (1998b) presented a parallel Fourier scheme for 3D using a
clever partitioning. Attempts have been made to combine pseudospectral schemes
with finite differences (e.g. using pseudospectral derivatives in horizontal
directions and finite differences in the vertical direction). Furumura et al. (2002)
present examples for ground motion modelling using such a hybrid scheme.
Furumura and collaborators also applied the pseudospectral method to global wave
propagation in the 2.5D axi-symmetric approximation (Furumura et al., 1998a;
Wang et al., 2001) with several interesting applications modelling heterogeneous
structures (e.g. Furumura et al., 1999; Furumura and Kennett, 2005).

An excellent description of a 3D Chebyshev implementation of anisotropic 
wave propagation with absorbing and free-surface boundary conditions is given 
in Tessmer (1995). Igel (1999) applied this scheme to regional-scale wave prop- 
agation in spherical coordinates. Recent developments include applications to 
strongly heterogeneous media using a poly-grid Chebyshev method (Seriani and 
Su, 2012). 


Chapter summary 


• Pseudospectral methods are based on discrete function approximations
that allow exact interpolation at so-called collocation points. The most
prominent examples are the Fourier method based on trigonometric basis
functions and the Chebyshev method based on Chebyshev polynomials.

• The Fourier method implicitly assumes periodic behaviour. Boundary
conditions like the free surface or absorbing behaviour are difficult to
implement.

• The Chebyshev method is based on the description of spatial fields using
Chebyshev polynomials defined in the interval [-1, 1] (easily generalized to
arbitrary domain sizes). Exact interpolation (and derivatives) are possible
when the discrete fields are defined at the Chebyshev collocation points
given by $x_i = \cos(\pi i/n)$, $i = 0, \ldots, n$.

• Because of the grid densification at the boundaries when using Chebyshev
collocation points, very small time steps are required for stable simulations
for increasing number of grid points. This can be avoided by stretching
the grids near the boundaries by a coordinate transformation.

• A major advantage of the Chebyshev method is an elegant formulation of
boundary conditions (free-surface or absorbing) through the definition of
so-called characteristic variables.

• Pseudospectral methods have isotropic errors. Therefore they lend
themselves to the study of physical anisotropy.

• The derivative operations of pseudospectral methods are of a global
nature. That means every point on a spatial grid contributes to the
gradient. While this is the basis for the high precision, it creates problems
when implementing pseudospectral algorithms on parallel computers with
distributed memory architectures. As communication is usually the
bottleneck, efficient and scalable parallelization of pseudospectral methods is
difficult.


FURTHER READING 


• Fornberg (1996) is an excellent book illustrating the properties of the
pseudospectral method in comparison with finite-difference methods.

• Bracewell (1999) is still a standard and very readable textbook on the
Fourier transform. There is also an interesting section on the life of Joseph
Fourier.

• Trefethen (2015) is a great book on spectral methods built around Matlab
programs.

• Stein and Shakarchi (2003) is the first of a more recent three-volume
account of Fourier analysis and its applications.

• Fichtner (2010) provides a detailed discussion of Fourier-derived
difference operators on staggered and central grids.


EXERCISES 
Comprehension questions 


(5.1) Explain the concept of exact interpolation behaviour in the context of
pseudospectral methods. What are cardinal functions?

(5.2) Explain the meaning of the term pseudospectral. What is so spectral about
the pseudospectral method?

(5.3) Through which concepts can the exact interpolation or derivative be
achieved? What is the price for this accuracy?

(5.4) Motivate the use of function approximations (e.g. Fourier series,
Chebyshev polynomials) for Earth science problems.

(5.5) What are the main differences between Fourier and Chebyshev
approaches? Give application examples.

(5.6) Discuss pros and cons of the Fourier method compared with the
finite-difference method. Give examples where you would prefer one over the
other. What is the role of computer architecture?

(5.7) Are there fundamental differences between the numerical dispersion
behaviour of the finite-difference and pseudospectral methods? If so,
why?

(5.8) What is the meaning of the convolution theorem and what is its
significance concerning numerical differentiation?

(5.9) The pseudospectral method appears simple, elegant, and very accurate.
Why is it not the preferred method of choice today? Could the
pseudospectral concept in 2(3)D be combined with the finite-difference
method (for space derivatives)?


Theoretical problems 


(5.10) How is the orthogonality of functions defined? Show the orthogonality of
sin(nx) for n > 0 evaluating

$$ \int_{-\pi}^{\pi} \sin(jx)\, \sin(kx)\, dx. $$

(5.11) The Fourier coefficients for an odd function can be obtained by

$$ b_n = \frac{2}{L} \int_0^L f(x)\, \sin\left( \frac{n \pi x}{L} \right) dx. $$

What is the meaning of the Fourier coefficients? Calculate the coefficients
n = 1, 2, ... for f(x) = x and L = 1 and plot the approximate function
using

$$ g_N(x) = \sum_{n=1}^{N} b_n \sin\left( \frac{n \pi x}{L} \right). \tag{5.87} $$

(5.12) Derive the Fourier series for f(x) = x in the interval $x \in [0, 2\pi]$ to
recover Eq. 5.15.


(5.13) In general, the spectrum F(k) of a function f(x) is
given by

$$ F(k) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(x)\, e^{-ikx}\, dx. $$

Use integration by parts to show that (only by) assuming f(x) vanishes if
$x \rightarrow \pm\infty$ we obtain the extremely useful result that $F^{(n)}(k) = (ik)^n F(k)$
is the spectrum of the nth derivative of f(x).

(5.14) The fact that we discretize space with dx implies that our wavenumber
space is limited by the Nyquist wavenumber $k_{Nyq} = \pi/dx$. Derive the
analytical form of the difference operator d(x) by an inverse transform

$$ d(x) = \int_{-k_{max}}^{k_{max}} (ik)\, e^{ikx}\, dk. $$

Hint: Use integration by parts. You need to recover Eq. 5.58.

(5.15) Use the concept of the previous exercise to derive the exact interpolation
operator

$$ d(x) = \int_{-k_{max}}^{k_{max}} e^{ikx}\, dk. $$

(5.16) Derive the dispersion relation for the Fourier pseudospectral
approximation of the 1D acoustic wave equation applying the von Neumann
analysis. Hint: Start with the wave equation and insert discrete plane wave
trial functions. You want to recover Eq. 5.38.

(5.17) Use the definition

$$ \cos[(n+1)\phi] + \cos[(n-1)\phi] = 2\cos(\phi)\cos(n\phi) \tag{5.88} $$

to recover the first five Chebyshev polynomials $T_n(\cos(\phi)) = \cos(n\phi) = T_n(x)$,
n = 1...5 with $x = \cos(\phi)$.


Programming exercises 


For the following exercises you can make use of the codes in the
supplementary electronic material.

(5.18) Define an arbitrary function (e.g. a Gaussian) and initialize its
derivative on the same regular spatial grid. Calculate the numerical derivative
using the Fourier method and the difference to the analytical derivative.
Vary the wavenumber content of the analytical function. Does it make a
difference? Is the derivative always exact to machine precision?

(5.19) Write a program that initializes the Chebyshev differentiation matrix
and perform the same task as in the previous exercise. Note that you need
to use the Chebyshev collocation points for the spatial grids. Increase the
number of grid points and discuss the difference of grid-point distance at
the centre and the boundary of the physical domain.
(5.20) Code a Fourier pseudospectral approximation to the 1D acoustic wave
equation from scratch and compare with the analytical solution. Use
parameters given in the examples.

(5.21) Code a Chebyshev pseudospectral approximation to the 1D acoustic
wave equation from scratch and compare with the analytical solution.
Calculate the derivatives using matrix-vector multiplication as discussed
in the text.

(5.22) Determine numerically the stability limit of the Fourier (and/or
Chebyshev) method applied to the 1D (2D) acoustic wave equation by varying
the Courant criterion.

(5.23) Implement a positive velocity discontinuity of 50% at the centre of the
1D domain. Observe the reflection as a function of dominant wavelength
(i.e. change the dominant frequency of the source wavelet).

(5.24) Keep the physical and numerical parameters of the Fourier simulation
constant and vary the number of grid points nx. Add a statement in the
code that measures the run time (it makes sense to only log the time
extrapolation loop). Plot the run time as a function of nx keeping the size
of the physical domain constant. Do the same with the Chebyshev code.
Compare the required time step as a function of nx.

(5.25) Code the analytical solution to the acoustic wave problem in 1D.
Compare the numerical result of the Fourier method with the analytical
solution in an appropriate frequency band. Do the same using the basic
1D finite-difference code. Fix the accuracy of the final solution keeping
the dominant frequency and propagation distance the same (e.g. 5%).
Find numerical parameters for the Fourier method and finite-difference
method that lead to the defined accuracy. Compare and discuss the
computational set-ups in terms of memory requirements, number of time
steps, Courant criterion, and computation time (compare with Fig. 5.14).


The Finite-Element Method 


The numerical methods encountered so far (the finite-difference and
pseudospectral methods) were based on purely mathematical concepts. The problem
posed was how one can deal with space or time derivatives in partial differential
equations when both space and time are discretized on (regular) grids. The
finite-element method is a numerical approach that first originated in solid mechanics
and structural engineering and was later put on solid mathematical foundations,
in part with concepts that were developed in the nineteenth century. The naming
of some of the elementary ingredients of the numerical method (e.g. stiffness and
mass matrices) indicates the origin in mechanics.

What are the basic principles? We always have to bring the problems described 
with continuous partial differential equations into some discrete form, in order to 
allow realistic problems to be solved. The engineering approach was to subdivide 
a mechanical structure into beams (i.e. elements) that are behaving in an identical
physical way, to link them at appropriate points (e.g. element corners), and finally 
to join all of them into a complete system (assembly). Obviously, in engineering, 
mechanical structures are usually geometrically complicated, which is why this 
requirement was the point of departure for the finite-element method, rather than 
the exception. Other techniques (e.g. the finite-difference and pseudospectral 
methods) were originally introduced for regular grids and later tweaked towards 
complex geometries, with more or less success. 

In terms of physics and governing equations, the path from structural engi- 
neering to seismology is a very short one. So it was natural that seismologists 
sought to apply the finite-element method to seismic wave propagation. Today, 
finite-element-type methods are particularly used for problems where geomet- 
rically complex structures like surface topography or internal structures like 
sedimentary layers have to be included (see Fig. 6.1). The finite-element method 
has been tremendously successful in the engineering world. Many commercial 
and non-commercial finite-element frameworks are available, which can be used 
to solve a huge class of problems. 

The book series on the finite-element method by O. C. Zienkiewicz and
co-authors (Zienkiewicz and Taylor, 1989; Zienkiewicz et al., 2013), which has
been running for half a century, is probably the most successful literature on
any numerical method. The books have an awesome clarity and I highly recommend
them for readers especially interested in this method. They characterize the
finite-element method as a general discretization procedure of continuum mechanics
problems posed by mathematically defined statements.¹ Also, they note that the

Computational Seismology. First Edition. Heiner Igel. 
© Heiner Igel 2017. Published in 2017 by Oxford University Press. 
1 And further define two major steps: (1) The continuum is divided into a finite
number of parts (elements), the behaviour of which is specified by a finite
number of parameters; and (2) the solution of the complete system as an assembly
of its elements follows precisely the same rules as those applicable to standard
discrete problems.

Fig. 6.1 Tetrahedral finite-element mesh for the seismic velocity structure
of the Grenoble basin. The various domains (sedimentary basin, bedrock)
are meshed separately and matched. With this approach the geometry of
internal interfaces, topography, and/or fault surfaces can be honoured. Figure
courtesy of M. Kaiser.


finite-element method is a unique approach which contains other methods (like 
finite-differences) as special cases. 

It is in this spirit that the finite-element method will be presented in this chap- 
ter, focusing on the mathematical steps that lead to the solution system. However, 
the book on Basis and Fundamentals of the Finite-Element Method (Zienkiewicz 
et al., 2013) has 700 pages! Therefore it is clear that we can only scratch the 
surface here. Suggestions for further reading are given at the end of the chapter. 


6.1 History 


We will focus on applications of the finite-element method to problems in seis- 
mology. As the free-surface boundary condition is implicitly fulfilled, the method 
lends itself to the study of surface wave propagation (Lysmer and Drake, 1972; 
Schlue, 1979). Seismic scattering problems were simulated with the method pre- 
sented in the dissertation of Day (1977). Its use was also investigated for problems 
in exploration seismics (Marfurt, 1984). It was found at the time that the method 
was more compute-intensive than other low-order methods such as finite differ- 
ences. Also, in addition to the physical propagation modes, parasitic modes for 
high-order implementations were found (Kelly and Marfurt, 1990). Later, Seron 
et al. (1990) further developed the low-order finite-element methods, making 
them more efficient for seismic exploration problems. 

At the same time the mathematical and engineering foundations of finite
elements were further developed in the classic books by Strang (1988), Zienkiewicz
and Taylor (1989), and Zienkiewicz et al. (2013). Parallel implementations on
the legendary CM-2 massively parallel supercomputer were presented by Li et al.
(1994). Finite-element principles are also the basis for the so-called direct solution
method (DSM), which was introduced by R. Geller and co-workers (Cummins
et al., 1994a; Cummins et al., 1994b; Cummins et al., 1997).

Many applications of the finite-element method to problems in seismic shaking 
hazards and engineering seismology were conducted by the group of J. Bielak and 
co-workers. Examples include modelling of ground motion simulation (Bielak 
et al., 1998), with specific applications to the response in alluvial valleys, e.g. 
Bielak and Xu (1999). The methods were later extended to the problem of full 
waveform inversion (Askan and Bielak, 2008; Epanomeritakis et al., 2008). 

Hybrid methods that make use of advantages of both finite-difference and
finite-element methods (e.g. for the implementation of free-surface boundary
conditions) were presented by Moczo et al. (2010b).

To my knowledge, the classic low-order finite-element method was not at first 
as widely used by the seismological community (or in seismic exploration) as 
finite-difference methods. This is most likely related to the more simple math- 
ematical concepts underlying finite-difference methods and the ease with which 
algorithms could be adapted to specific problems. Another point is the fact that 
finite-difference solutions are more easily implemented on parallel computers. 


Low-order finite-element methods require the solution of an often very large 
linear system of equations with global communication requirements. 

It took developments that started in the eighties, leading to high-order vari- 
ations of the finite-element method using Lagrange polynomials or Chebyshev 
polynomials as basis functions (termed spectral elements), that initiated the success 
of this approach as we know it today (see next chapter). 

In my view the classic low-order finite-element method deserves its place as 
a separate chapter in this volume despite the fact that the spectral element ap- 
proach in the following chapter is a straightforward extension. This allows the 
introduction of the fundamental concepts of Galerkin-type methods at a more 
fundamental level, starting with the static problem well known from engineering 
applications. 


6.2 Finite elements in a nutshell 


We start by stating the 1D elastic wave equation with space-dependent density $\rho$,
shear modulus $\mu$, and forcing term $f$:

$$ \rho\, \partial_t^2 u = \partial_x\, \mu\, \partial_x u + f. \tag{6.1} $$

We seek solutions to the displacement field $u(x, t)$. A fundamental contrast with
the finite-difference method is that we do not solve for the displacement field $u$
directly but instead replace it by a finite sum over (initially linear) basis functions $\varphi_i$:

$$ u(x) \approx \bar{u}(x) = \sum_{i=1}^{N} u_i(t)\, \varphi_i(x). \tag{6.2} $$

Our unknowns are the coefficients of the basis functions $\varphi_i$, which we term $u_i$ as
they actually correspond to the discrete displacement values at node points $x_i$.

Furthermore, we formulate a so-called weak form of the wave equation,
multiplying the original strong form by a test function $\varphi_j$ with the same basis (this is the
Galerkin principle), followed by an integration over the entire physical domain.
This leads to a linear system of equations for domain D of the form

$$ \int_D \rho\, \partial_t^2 \bar{u}\, \varphi_j\, dx + \int_D \mu\, \partial_x \bar{u}\, \partial_x \varphi_j\, dx = \int_D f\, \varphi_j\, dx, \tag{6.3} $$

where we seek to find the approximate displacement field $\bar{u}$ for given model
parameters and forcing. With appropriate definitions of the vectors and matrices
of this linear system, we can replace the time derivative of the solution field
with finite differences and formulate an extrapolation problem as with all other
numerical methods discussed in this volume.

Given appropriate initial conditions (everything is at rest, $u(t = 0) = 0$) the
solution at the next time step $u(t + dt)$ can be found by the following matrix-vector
equation:

$$ \mathbf{u}(t + dt) = dt^2\, (\mathbf{M}^T)^{-1} \left[ \mathbf{f} - \mathbf{K}^T \mathbf{u} \right] + 2\mathbf{u}(t) - \mathbf{u}(t - dt), \tag{6.4} $$

where M and K are mass and stiffness matrices, respectively.

Fig. 6.2 Finite element method in a nutshell. Bottom: Snapshot of a
finite-element simulation of 1D elastic wave propagation in a medium with three
different velocities (grey shading). Top: Detail at the centre of the domain. The
vertical lines indicate the element boundaries and the + signs the locations
at which the displacement values are evaluated. Inside the elements the
displacement field is described by a linear function. The solution field u(x) is
approximated by a sum over basis functions $\varphi_i$ appropriately weighted.

2 Because of its potentially huge size and sparsity, it is hardly ever initialized as a
matrix in realistic applications.

One of the most important aspects of the finite-element method is the fact that 
these are global matrices in the sense that if a physical domain is discretized with 
N points, then the matrix shape is N x N. Note also that one of the matrices 
has to be inverted.” In general, the matrix inversion has to be performed using 
sophisticated solution strategies, depending on the specific problem, grid type, 
and spatial dimensions. An example of a 1D finite-element simulation is shown 
in Fig. 6.2. 

In our example the mass matrix M consists of elements of the form
$\int_D \rho\,\varphi_i \varphi_j\,dx$ and the stiffness matrix K is built up with
elements of the form $\int_D \mu\,\nabla\varphi_i \nabla\varphi_j\,dx$.
These integrals can be computed in an elegant way for each element by mapping
the physical space to a local reference space. If the parameters $\mu$ and $\rho$
are constant inside the elements, these integrals can be computed analytically. The most
attractive feature of the finite-element method is the fact that it can be formulated 
for arbitrary element shapes. This allows the simulation of models with highly 
complex geometric features and is the reason why this method is so successful for 
engineering problems. 

In seismology, wide use of the finite-element method was hampered by the 
necessity to solve a huge system of equations. This situation can be substan- 
tially improved by a specific choice of basis functions and a numerical integration 
scheme, leading to the spectral-element method discussed in the next chapter. 


6.3 Static elasticity 


To make the introduction to the finite-element method as easy as possible, we start
with the case of static elasticity. Departing from the 1D elastic wave equation

$$ \rho(x)\,\partial_t^2 u(x,t) = \mu(x)\,\partial_x^2 u(x,t) + f(x,t), \tag{6.5} $$

we assume that the displacement does not depend on time ($\partial_t^2 u(x,t) = 0$).
Here, $\mu$ is the shear modulus, $u$ is the displacement field, and $f$ is external
forcing. Furthermore, the elastic properties of our 1D medium (e.g. a string or a
bar) are independent of space (homogeneous, $\mu(x) = \text{const.}$), so that we
arrive at the following differential equation:

$$ -\mu\,\partial_x^2 u = f. \tag{6.6} $$

This has the mathematical form of a Poisson equation (if we assume $\mu = 1$ or
bring it to the right-hand side). The problem we are solving here corresponds to
the question of how much a string is displaced if you pull with a certain force
(see Fig. 6.3).


In the following we proceed with the mathematical steps that lead to the classic
discrete finite-element solution scheme. The first step is to transform the strong
form of the differential equation into the weak form by multiplying the equation
with an arbitrary space-dependent (real, well-behaved) test function that we
denote as $v \rightarrow v(x)$. Then we integrate the equation on both sides over the
entire physical domain D with $x \in D$ and obtain (omitting space dependencies)

$$ -\int_D \mu\,\partial_x^2 u\,v\,dx = \int_D f\,v\,dx. \tag{6.7} $$

Note that we have not changed the solution to this equation. Our unknown field is
$u(x)$ given constant $\mu$ and our choice of forcing $f(x)$. We carry out an
integration by parts of the left side of the previous equation:

$$ -\int_D \mu\,\partial_x^2 u\,v\,dx = \int_D \mu\,\partial_x v\,\partial_x u\,dx - \left[\mu\,\partial_x u\,v\right]_{x_{\min}}^{x_{\max}}, \tag{6.8} $$

where the last term is an anti-derivative.

The next step and the argument involved is of fundamental importance and 
leads to one of the most attractive features of finite- (or spectral-) element meth- 
ods. Remember that a free-surface condition implies that the stress vanishes at 
some boundaries of our physical domain. In 1D the stress is given as $\sigma = \mu\,\partial_x u$. As
the anti-derivative is evaluated at the domain boundaries, this implies—given free- 
surface conditions—that this term vanishes. From a computational point of view 
this means we get the free-surface boundary conditions for free (it is implicitly 
solved correctly). For several other methods (like the finite-difference method and 
the pseudospectral method) the correct implementation of the free-surface condi- 
tion with the same accuracy as the wavefield inside the domain is a big headache. 
Thus, assuming we have a free surface at the edges of our domain D, we obtain 
the weak form as 


$$ \mu \int_D \partial_x u\,\partial_x v\,dx = \int_D f\,v\,dx, \tag{6.9} $$

which is still a continuous description.
To enter the discrete world we replace our exact solution $u(x)$ by $\bar{u}$, a sum
over some basis functions $\varphi_i$ that we do not yet specify:³

$$ u(x) \approx \bar{u}(x) = \sum_{i=1}^{N} u_i\,\varphi_i(x). \tag{6.10} $$

If our original weak equation holds for $u$ we also expect it to hold for $\bar{u}$,
and replacing $u$ we obtain

$$ \mu \int_D \partial_x \bar{u}\,\partial_x v\,dx = \int_D f\,v\,dx. \tag{6.11} $$

The next step is another fundamental concept of numerical analysis, known as
the Galerkin method or principle. As a choice for our test function $v(x)$ we use
the same set of basis functions. Thus $v \rightarrow \varphi_j(x)$.
Fig. 6.3 Static elasticity. A string with homogeneous properties (density and
shear modulus) is pulled with a certain force. The Poisson equation determines
the displacement of the string given appropriate boundary conditions. Don’t
overdo this experiment, particularly if you have old strings.


3 By developing the weak form of the 
wave equation we are reducing the di- 
mensionality of our original problem from 
infinite- to finite-dimensional, solving the 
differential equation in a subspace using lin- 
ear algebra. The most important property 
of the Galerkin method is the fact that the 
error is orthogonal to this subspace. 
Fig. 6.4 Linear basis functions for the finite-element method. A 1D domain is
discretized with n - 1 elements having n = 10 element boundaries (open circles).
The basis functions $\varphi_i = 1$ at $x = x_i$. With this basis an arbitrary
function can be exactly interpolated at the element boundary points $x_i$. For the
finite-element system of equations, integrals over the basis functions and their
derivatives have to be evaluated separately for each element. (Axes: basis
function $\varphi_i$ versus element boundaries.)


What is the simplest choice for our basis functions $\varphi_i$? Denoting $x_i$,
$i = 1, 2, \ldots, N$ as the boundaries of our elements we define our basis functions
such that $\varphi_i = 1$ at $x = x_i$ and zero at all other element boundaries.
Inside the elements our solution field is described by a linear function. In
mathematical terms this is obtained by

$$ \varphi_i(x) = \begin{cases} \dfrac{x - x_{i-1}}{x_i - x_{i-1}} & \text{for } x_{i-1} < x \le x_i \\[2mm] \dfrac{x_{i+1} - x}{x_{i+1} - x_i} & \text{for } x_i < x < x_{i+1} \\[2mm] 0 & \text{elsewhere,} \end{cases} \tag{6.12} $$

but more easily grasped by looking at Fig. 6.4. Recall the definition of our
approximate solution field $u \approx \bar{u}(x) = \sum_i u_i \varphi_i$. It immediately
becomes clear why we denoted the coefficients of the basis functions as $u_i$.
Because of the unit value of basis function $\varphi_i$ at $x = x_i$ the coefficients
are the solutions of the displacement field, defined at the element boundaries.
Inside the finite elements the solution field is interpolated by a linear function.
It is important to note (and this is a major difference to finite-volume or
discontinuous Galerkin methods) that adjacent elements share the same value at the
boundaries. This finally leads to the requirement to solve a large global linear
system of equations.
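The piecewise definition of the linear basis functions can be made concrete with a few lines of NumPy. The following is a minimal sketch rather than part of the original listings: the helper name `hat`, the irregular node positions and the sine test function are illustrative assumptions. It evaluates Eq. 6.12 and checks that the weighted sum of basis functions is the piecewise-linear interpolant through the nodal values.

```python
import numpy as np

def hat(i, x_nodes, x):
    """Linear basis function phi_i (Eq. 6.12) evaluated at the points x."""
    x = np.asarray(x, dtype=float)
    phi = np.zeros_like(x)
    xi = x_nodes[i]
    if i > 0:                          # rising flank on the element to the left
        xl = x_nodes[i - 1]
        m = (x > xl) & (x <= xi)
        phi[m] = (x[m] - xl) / (xi - xl)
    if i < len(x_nodes) - 1:           # falling flank on the element to the right
        xr = x_nodes[i + 1]
        m = (x > xi) & (x < xr)
        phi[m] = (xr - x[m]) / (xr - xi)
    phi[x == xi] = 1.0                 # unit value at the node itself
    return phi

# Hypothetical irregular element boundaries and nodal values u_i
x_nodes = np.array([0.0, 0.1, 0.3, 0.6, 1.0])
u_nodes = np.sin(2 * np.pi * x_nodes)

# Approximate field u_bar(x) = sum_i u_i phi_i(x) on a fine grid
x = np.linspace(0.0, 1.0, 201)
basis = np.array([hat(i, x_nodes, x) for i in range(len(x_nodes))])
u_bar = u_nodes @ basis
```

Because each coefficient multiplies a function that is one at its own node and zero at all others, `u_bar` coincides with the nodal values at the element boundaries and is linear in between.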

We are ready to assemble our discrete version of the weak form by replacing
the continuous displacement field by its approximation and applying the Galerkin
principle. Putting Eq. 6.10 into Eq. 6.9 we obtain for each $j = 1, \ldots, N$

$$ \begin{aligned} \mu \int_D \partial_x \Big( \sum_{i=1}^{N} u_i\,\varphi_i \Big)\,\partial_x \varphi_j\,dx &= \int_D f\,\varphi_j\,dx \\ \sum_{i=1}^{N} u_i\,\mu \int_D \partial_x \varphi_i\,\partial_x \varphi_j\,dx &= \int_D f\,\varphi_j\,dx, \end{aligned} \tag{6.13} $$

which is a system of N equations since we project the solution on the basis
functions $\varphi_j$ with $j = 1, \ldots, N$. In the second equation we switched the
sequence of integration and sum. The discrete system thus obtained can be written
using matrix-vector notation. We define the solution vector u (i.e. the
coefficients of our basis functions) as

$$ \mathbf{u} = \begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_N \end{pmatrix}, \tag{6.14} $$

corresponding to the values at the element boundaries. The source vector f can
be written as

$$ \mathbf{f} = \begin{pmatrix} \int_D f\,\varphi_1\,dx \\ \int_D f\,\varphi_2\,dx \\ \vdots \\ \int_D f\,\varphi_N\,dx \end{pmatrix}, \tag{6.15} $$

and the matrix containing the integral over the basis function derivatives K reads

$$ \mathbf{K} \rightarrow K_{ij} = \mu \int_D \partial_x \varphi_i\,\partial_x \varphi_j\,dx. \tag{6.16} $$

This system of equations can be written in component form as

$$ u_i K_{ij} = f_j, \tag{6.17} $$

where we use the Einstein summation convention, and in matrix-vector notation as

$$ \mathbf{K}^T \mathbf{u} = \mathbf{f}. \tag{6.18} $$


Note that here we encounter for the first time an important matrix in finite- 
element analysis: K is called the stiffness matrix, because in its original form it 
contains elastic parameters under the integration sign (here we assume they are 
constant). This system of equations has as many unknowns as equations. Pro- 
vided that the matrix is positive definite, we can determine its inverse. In that case 
the solution to our problem is finally 


$$ \mathbf{u} = (\mathbf{K}^T)^{-1}\,\mathbf{f}. \tag{6.19} $$


Again, we encounter a fundamental characteristic of finite-element-type problems.
To find solutions, a linear system of equations has to be solved. The size
of the system matrix is in general the number of degrees of freedom squared,
or $N^2$ if the domain is discretized with N points (the corresponding number of
elements is then N - 1). We will shortly illustrate this with an example.
Fig. 6.5 Static elasticity with boundary conditions. Graphical representation of
the matrix-vector system $\mathbf{K}^T \mathbf{u} = \mathbf{f}$ with boundary
conditions. The modified global system matrix has $(N-2) \times (N-2)$ elements. The
point in the middle of the source vector corresponds to a forcing inside the
medium. The black top and bottom elements denote the boundary condition
modifying the source term f.

4 In my view this is one reason why in seismology the finite-element method
did not take off as much as the finite-difference method when computational
resources began to make 2D and 3D calculations practical.


What are the consequences of having to invert a global system matrix? It im- 
plies that we look into solving a gigantic system of linear equations for realistic 
cases in more than one dimension! We cannot in general expect that the matrix 
is diagonal (which implies trivial inversion, see the chapter on spectral-element 
methods) or simply banded. Therefore, the entire toolbox of linear algebra con- 
cerning the solution of such systems has to be invoked. Solving linear systems 
of equations on parallel computers has become an independent field of research. 
Every new generation of parallel hardware is benchmarked against linear system 
solvers.⁴

Before we provide details on how to evaluate this system of equations we briefly 
look at the problem of special boundary conditions. 


6.3.1 Boundary conditions 


In case of a free surface (i.e. stress-free) boundary there is nothing to do, as it is 
implicitly fulfilled. Other boundary conditions can be implemented in a straight- 
forward way. In case you would like to invoke specific values at the boundaries 
the approximate solution becomes 


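$$ \bar{u} = u_1 \varphi_1 + \sum_{i=2}^{N-1} u_i\,\varphi_i + u_N \varphi_N, \tag{6.20} $$

where $u_1 = u(x_{\min})$ and $u_N = u(x_{\max})$ are the boundary values. Injecting
this into the weak form we obtain, after rearranging terms,

$$ \begin{aligned} \sum_{i=2}^{N-1} u_i\,\mu \int_D \partial_x \varphi_i\,\partial_x \varphi_j\,dx = \int_D f\,\varphi_j\,dx &- u(x_{\min})\,\mu \int_D \partial_x \varphi_1\,\partial_x \varphi_j\,dx \\ &- u(x_{\max})\,\mu \int_D \partial_x \varphi_N\,\partial_x \varphi_j\,dx, \end{aligned} \tag{6.21} $$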


observing that we have basically modified the right-hand side of the equation, 
and thus the source term. It is instructive to show this in a graphical way (see 
Fig. 6.5). For a finite-element system of size N we fix the boundary values and 
therefore reduce the number of unknowns to N — 2. The system feels the boundary 
conditions through a modified source term that affects the solution everywhere. 


6.3.2 Reference element, mapping, stiffness matrix 


When we introduced the basis functions earlier they were defined in the entire
physical domain D. Another characteristic feature of the finite-element method
that substantially simplifies the calculations performed on a computer is the fact
that for all elements basically the same calculations have to be performed, with
only slight modifications. Therefore, a standard procedure is to map the physical
domain to a reference element. In our case we centre the local coordinate system
denoted as $\xi$ at point $x_i$ and obtain

$$ \begin{aligned} \xi &= x - x_i \\ h_i &= x_{i+1} - x_i, \end{aligned} \tag{6.22} $$

where $h_i$ denotes the size of element $i$ defined in the interval
$x \in [x_i, x_{i+1}]$. After this coordinate transform the definition of the local
basis functions becomes

$$ \varphi_i(\xi) = \begin{cases} \dfrac{\xi}{h_{i-1}} + 1 & \text{for } -h_{i-1} < \xi \le 0 \\[2mm] 1 - \dfrac{\xi}{h_i} & \text{for } 0 < \xi < h_i \\[2mm] 0 & \text{elsewhere,} \end{cases} \tag{6.23} $$

and their derivatives

$$ \partial_\xi \varphi_i(\xi) = \begin{cases} \dfrac{1}{h_{i-1}} & \text{for } -h_{i-1} < \xi \le 0 \\[2mm] -\dfrac{1}{h_i} & \text{for } 0 < \xi < h_i \\[2mm] 0 & \text{elsewhere.} \end{cases} \tag{6.24} $$

We can now proceed with calculating the elements of the stiffness matrix K
defined as

$$ K_{ij} = \mu \int_D \partial_x \varphi_i\,\partial_x \varphi_j\,dx, \tag{6.25} $$

with the corresponding expression in local coordinates $\xi$

$$ K_{ij} = \mu \int_{D_\xi} \partial_\xi \varphi_i\,\partial_\xi \varphi_j\,d\xi. \tag{6.26} $$

Before solving these integrals, let us pause for a moment and reflect on what the
structure of this matrix might look like. First, we consider the basis functions
and their derivatives in the local coordinate system. This is illustrated in Fig. 6.6.
We note that they overlap only for adjacent indices, leading to a banded matrix
structure. Because the basis functions and their derivatives are not differentiable
at the element boundaries $x_i$ the integrals have to be evaluated for each elemental
domain separately. Let us calculate some of the elements of matrix $K_{ij}$, starting
with the diagonal elements. For example, for $K_{11}$ we obtain

$$ K_{11} = \mu \int_D \partial_x \varphi_1\,\partial_x \varphi_1\,dx = \mu \int_0^h \left(-\frac{1}{h}\right)\left(-\frac{1}{h}\right) d\xi = \frac{\mu}{h^2} \int_0^h d\xi = \frac{\mu}{h}, \tag{6.27} $$

where we only have to perform the integration over element $i = 1$, since
$\partial_\xi \varphi_1 = 0$ elsewhere. For the next diagonal element $K_{22}$ we see
that the derivatives overlap in
both elements 1 and 2, implying that in the local coordinate system the integration
has to be performed for the interval $\xi \in [-h, h]$ (assuming for simplicity here
that the elements are all of the same size):

$$ K_{22} = \mu \int_D \partial_x \varphi_2\,\partial_x \varphi_2\,dx = \mu \int_{-h}^{0} \partial_\xi \varphi_2\,\partial_\xi \varphi_2\,d\xi + \mu \int_{0}^{h} \partial_\xi \varphi_2\,\partial_\xi \varphi_2\,d\xi = \frac{2\mu}{h}. \tag{6.28} $$

Equivalently, the off-diagonal terms overlap only in one element, for example

$$ \begin{aligned} K_{12} &= \mu \int_D \partial_x \varphi_1\,\partial_x \varphi_2\,dx = \mu \int_0^h \partial_\xi \varphi_1\,\partial_\xi \varphi_2\,d\xi = \mu \int_0^h \left(-\frac{1}{h}\right)\frac{1}{h}\,d\xi = -\frac{\mu}{h} \\ K_{21} &= K_{12}. \end{aligned} \tag{6.29} $$

Fig. 6.6 Basis functions and their derivatives. Top: The basis function
$\varphi_i$ (thick solid line) is shown along with the neighbouring functions
$\varphi_{i \pm 1}$ (thin dashed lines). Bottom: The same for their derivatives with
respect to the space coordinate $\xi$. (Horizontal axis: element edges.)

Finally, the stiffness matrix for an elastic physical system with constant shear
modulus $\mu$ and uniform element size h (see Section 6.4.2 for the general
case) reads

$$ K_{ij} = \frac{\mu}{h} \begin{pmatrix} 1 & -1 & & & \\ -1 & 2 & -1 & & \\ & \ddots & \ddots & \ddots & \\ & & -1 & 2 & -1 \\ & & & -1 & 1 \end{pmatrix}, \tag{6.30} $$

and these numbers probably ring some bells. It might not come as a surprise that
the space-dependent terms in our linear system are proportional to the three-point
operator matrix for a second finite-difference derivative. This will be discussed in
more detail shortly. Let us turn this into a computer program.


6.3.3 Simulation example


We demonstrate the finite-element solution to the static elastic problem with a
simple toy problem. The parameters are given in Table 6.1. The physical domain
is defined in the interval $x \in [0, 1]$ and we apply a unit forcing at x = 0.75 at
one of the element boundary points.

Table 6.1 1D static elastic problem with the finite-element method.
Homogeneous case.

Parameter   Value
x_max       1
nx          20
μ           1
h           0.0526
u(0)        0.15
u(1)        0.05

The following Python code fragment presents a possible implementation of
the finite-element solution and the calculation of the stiffness matrix in the case
of constant element size and shear modulus.


# [...]
from numpy import zeros, linspace, linalg

# Basic parameters
nx = 20                   # Number of boundary points
u = zeros(nx)             # Solution vector
f = zeros(nx)             # Source vector
mu = 1                    # Constant shear modulus

# Element boundary points
x = linspace(0, 1, nx)    # x in [0,1]
h = x[2] - x[1]           # Constant element size

# Assemble stiffness matrix K_ij (interior points only)
K = zeros((nx, nx))
for i in range(1, nx-1):
    for j in range(1, nx-1):
        if i == j:
            K[i, j] = 2 * mu/h
        elif i == j + 1:
            K[i, j] = -mu/h
        elif i + 1 == j:
            K[i, j] = -mu/h
        else:
            K[i, j] = 0

# Source term is a spike at i = 15
f[15] = 1

# Boundary condition at x = 0 (enters through the source term)
u[0] = 0.15 ; f[1] = mu * u[0]/h
# Boundary condition at x = 1
u[nx-1] = 0.05 ; f[nx-2] = mu * u[nx-1]/h

# Finite-element solution for the interior points
u[1:nx-1] = linalg.inv(K[1:nx-1, 1:nx-1]) @ f[1:nx-1].T
# [...]


At this point we make a short detour and ask how this static problem might be
solved with finite differences for comparison. Starting with the Poisson equation
$-\mu\,\partial_x^2 u = f$, omitting space dependencies, we replace the left-hand side
with a finite-difference approximation and obtain

$$ -\mu\,\frac{u(x-h) - 2u(x) + u(x+h)}{h^2} = f \tag{6.31} $$

and, after rearranging,

$$ u(x) = \frac{u(x-h) + u(x+h)}{2} + \frac{h^2}{2\mu}\,f. \tag{6.32} $$

Fig. 6.7 Static elasticity with boundary conditions. Simulation example
comparing the finite-element solution (thick solid line) with a
finite-difference-based relaxation method (thin lines) that iteratively converges
to the correct solution (see text for details). (Axes: u(x) versus x.)

How should we interpret this equation? An update at point x is obtained by
averaging the two surrounding points plus some scaled source term. We should not
expect this to lead to the correct solution in one step. But this equation can be
used as an iterative procedure with an initial guess for the unknown field $u$. With
discretization $u_i = u(x_i)$ and iteration step $k$ this can be written as

$$ u_i^{k+1} = \frac{u_{i+1}^{k} + u_{i-1}^{k}}{2} + \frac{h^2}{2\mu}\,f_i, \tag{6.33} $$

with initial guess $u_i^{k=0} = 0$. This is implemented in the following script, and
we compare it to the finite-element solution. This approach is called a relaxation
method.


# [...]
# Forcing
f[15] = 1/h    # force vector
for it in range(nt):
    # Calculate the sum of the two neighbours of u (omit boundaries)
    for i in range(1, nx-1):
        du[i] = u[i+1] + u[i-1]
    u = 0.5 * (f * h**2/mu + du)
    u[0] = 0.15       # Boundary condition at x=0
    u[nx-1] = 0.05    # Boundary condition at x=1
# [...]


The results are shown in Fig. 6.7. The finite-element solution reaches the solution 
to the problem in one step involving the inversion of the stiffness matrix K. In this 
example 500 iterations were employed for the relaxation method and the solution 
is shown after every 25 iterations. Obviously, the finite-element method is much 
faster. An interesting point is that the force term in the finite-element method
is divided by the element width h by means of an integral over the derivative of
the basis function. In the finite-difference approximation the division by h comes
from the requirement to have the final solution independent of the grid distance
(see Section 4.4 on source injection in Chapter 4 on the finite-difference method).

It is instructive to extend this simplest case to arbitrary element sizes and 
space-varying elastic properties (see exercises). In the next section we add time 
dependence to our system. 
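The comparison between the direct finite-element solve and the relaxation method can be reproduced with a short self-contained script. This is a sketch under stated assumptions, not the original program: it uses the parameters of Table 6.1, builds the stiffness matrix with `np.eye` instead of the double loop, replaces the explicit matrix inverse by `np.linalg.solve`, and vectorizes the relaxation update; the names `relax`, `u_fe` and `rhs` are illustrative.

```python
import numpy as np

# Parameters of Table 6.1
nx, mu = 20, 1.0
x = np.linspace(0.0, 1.0, nx)
h = x[1] - x[0]
u_left, u_right = 0.15, 0.05
isrc = 15                      # source node used in the listings of this section

# Finite-element solution: reduced (nx-2) x (nx-2) system with the
# boundary values moved to the right-hand side
K = (mu / h) * (2 * np.eye(nx) - np.eye(nx, k=1) - np.eye(nx, k=-1))
f = np.zeros(nx)
f[isrc] = 1.0
rhs = f[1:-1].copy()
rhs[0] += mu * u_left / h
rhs[-1] += mu * u_right / h
u_fe = np.empty(nx)
u_fe[0], u_fe[-1] = u_left, u_right
u_fe[1:-1] = np.linalg.solve(K[1:-1, 1:-1], rhs)

def relax(n_iter):
    """Finite-difference relaxation (Eq. 6.33) with fixed boundary values."""
    u = np.zeros(nx)
    u[0], u[-1] = u_left, u_right
    f_fd = f / h               # point force scaled by the grid distance
    for _ in range(n_iter):
        # the right-hand side is evaluated completely before assignment
        u[1:-1] = 0.5 * (u[2:] + u[:-2]) + 0.5 * h**2 / mu * f_fd[1:-1]
    return u

for n_iter in (25, 500, 5000):
    err = np.max(np.abs(relax(n_iter) - u_fe))
    print(f"{n_iter:5d} iterations: max |u_fd - u_fe| = {err:.2e}")
```

Both approaches solve the same discrete equations, so the relaxation result approaches the finite-element solution as the number of iterations grows; the direct solve obtains it in a single step.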


6.4 1D elastic wave equation 


Let us apply what we have learned about the finite-element approach and the
Galerkin principle to the 1D elastic wave equation

$$ \rho\,\partial_t^2 u = \partial_x (\mu\,\partial_x u) + f, \tag{6.34} $$

where again we omit space and time dependencies. From now on we assume
that the properties of the medium, density $\rho$ and shear modulus $\mu$, are both
space-dependent. We obtain a weak form of the wave equation by integrating
over the entire physical domain D and at the same time multiplying the original
equation with an arbitrary basis function $\varphi_j$:

$$ \int_D \rho\,\partial_t^2 u\,\varphi_j\,dx = \int_D \partial_x (\mu\,\partial_x u)\,\varphi_j\,dx + \int_D f\,\varphi_j\,dx. \tag{6.35} $$

Integration by parts of the term containing the space derivatives leads to

$$ \int_D \partial_x (\mu\,\partial_x u)\,\varphi_j\,dx = \left[\mu\,\partial_x u\,\varphi_j\right]_{x_{\min}}^{x_{\max}} - \int_D \mu\,\partial_x u\,\partial_x \varphi_j\,dx, \tag{6.36} $$

and again we drop the term with the antiderivative that is evaluated at the edges
of the physical domain. This is equivalent to a stress-free boundary condition as
discussed above. Reinjecting this result into the weak form, we obtain

$$ \int_D \rho\,\partial_t^2 u\,\varphi_j\,dx + \int_D \mu\,\partial_x u\,\partial_x \varphi_j\,dx = \int_D f\,\varphi_j\,dx, \tag{6.37} $$

where $u$ is the continuous unknown displacement field. We replace the exact
displacement field by an approximation $\bar{u}$ of the form

$$ u(x,t) \rightarrow \bar{u}(x,t) = \sum_{i=1}^{N} u_i(t)\,\varphi_i(x), \tag{6.38} $$

where the coefficients $u_i$ are expected to correspond to a discrete representation
of the solution field at the element boundaries, following the earlier discussion on
linear basis functions. The approximate displacement field is of course constrained
by the same wave equation, thus

$$ \int_D \rho\,\partial_t^2 \bar{u}\,\varphi_j\,dx + \int_D \mu\,\partial_x \bar{u}\,\partial_x \varphi_j\,dx = \int_D f\,\varphi_j\,dx. \tag{6.39} $$

Injecting Eq. 6.38 into Eq. 6.39, we can turn the continuous weak form into a
system of linear equations:

$$ \int_D \rho\,\partial_t^2 \Big( \sum_{i=1}^{N} u_i(t)\,\varphi_i \Big) \varphi_j\,dx + \int_D \mu\,\partial_x \Big( \sum_{i=1}^{N} u_i(t)\,\varphi_i \Big) \partial_x \varphi_j\,dx = \int_D f\,\varphi_j\,dx, \tag{6.40} $$

where we highlight the fact that only the coefficients $u_i(t)$ are time-dependent.
Changing the order of integration and summation, we obtain

$$ \sum_{i=1}^{N} \partial_t^2 u_i \int_D \rho\,\varphi_i\,\varphi_j\,dx + \sum_{i=1}^{N} u_i \int_D \mu\,\partial_x \varphi_i\,\partial_x \varphi_j\,dx = \int_D f\,\varphi_j\,dx, \tag{6.41} $$

using the fact that the unknown coefficients $u_i$ only depend on time.
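As in the static elastic case we proceed using matrix-vector notation, with
the following definitions for the time-dependent solution vector of displacement
values u(t), mass matrix M (sensible name, as it contains density), the already
well-known stiffness matrix K, and the source vector f:

$$ \begin{aligned} \mathbf{u}(t) &\rightarrow u_i(t) \\ \mathbf{M} &\rightarrow M_{ij} = \int_D \rho\,\varphi_i\,\varphi_j\,dx \\ \mathbf{K} &\rightarrow K_{ij} = \int_D \mu\,\partial_x \varphi_i\,\partial_x \varphi_j\,dx \\ \mathbf{f} &\rightarrow f_j = \int_D f\,\varphi_j\,dx. \end{aligned} \tag{6.42} $$

Thus we can write the system of equations as

$$ \partial_t^2 \mathbf{u}\,\mathbf{M} + \mathbf{u}\,\mathbf{K} = \mathbf{f}, \tag{6.43} $$

or with transposed system matrices as

$$ \mathbf{M}^T \partial_t^2 \mathbf{u} + \mathbf{K}^T \mathbf{u} = \mathbf{f}. \tag{6.44} $$

For the second time derivative we use a standard finite-difference approximation:

$$ \partial_t^2 \mathbf{u} \approx \frac{\mathbf{u}(t + dt) - 2\mathbf{u}(t) + \mathbf{u}(t - dt)}{dt^2}, \tag{6.45} $$

replacing the original partial derivative with respect to time to obtain

$$ \mathbf{M}^T \left[ \frac{\mathbf{u}(t + dt) - 2\mathbf{u}(t) + \mathbf{u}(t - dt)}{dt^2} \right] = \mathbf{f} - \mathbf{K}^T \mathbf{u}. \tag{6.46} $$

This system can be extrapolated in an analogous fashion to the other numerical
methods we have already encountered. Starting from an initial state $u(t = 0) = 0$
we can determine the displacement field at time $t + dt$ by

$$ \mathbf{u}(t + dt) = dt^2\,(\mathbf{M}^T)^{-1} \left[ \mathbf{f} - \mathbf{K}^T \mathbf{u} \right] + 2\mathbf{u}(t) - \mathbf{u}(t - dt). \tag{6.47} $$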


This equation will be implemented in the computer program. Again, like in the 
static case, the solution depends on the inversion of a global system matrix, in this 
case the mass matrix. In the general case, for 2D or 3D problems this huge matrix 
might be sparse but there is no way around a global solution scheme unless we 
find a scheme for a diagonal mass matrix. This is in fact possible, with the right 
choice of basis functions (Lagrange polynomials) and a corresponding numerical 
integration scheme (Gauss integration). This is a specific form of the spectral- 
element method that is discussed in Chapter 7. Another observation is that in 
this formulation the mass and stiffness matrices do not depend on time. For the 
mass matrix this implies that it can be inverted once and for all prior to the time 
extrapolation as an initialization step. What remains to be done before presenting 
examples is to discuss how the system matrices can be calculated. 


6.4.1 The system matrices 


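To calculate the entries of the system matrices we transform the space coordinate
into a local system, as we have done in the static case

$$ \begin{aligned} \xi &= x - x_i \\ h_i &= x_{i+1} - x_i. \end{aligned} \tag{6.48} $$

However, now we allow the element size $h_i$ to vary. With the definition above
element $i$ is defined in the interval $x \in [x_i, x_{i+1}]$. In the local coordinate
system the basis functions are defined by

$$ \varphi_i(\xi) = \begin{cases} \dfrac{\xi}{h_{i-1}} + 1 & \text{for } -h_{i-1} < \xi \le 0 \\[2mm] 1 - \dfrac{\xi}{h_i} & \text{for } 0 < \xi < h_i \\[2mm] 0 & \text{elsewhere,} \end{cases} \tag{6.49} $$

with the corresponding derivatives

$$ \partial_\xi \varphi_i(\xi) = \begin{cases} \dfrac{1}{h_{i-1}} & \text{for } -h_{i-1} < \xi \le 0 \\[2mm] -\dfrac{1}{h_i} & \text{for } 0 < \xi < h_i \\[2mm] 0 & \text{elsewhere.} \end{cases} \tag{6.50} $$

An example of basis functions and their derivatives for an irregular grid with
varying element sizes is shown in Fig. 6.8. We are ready to assemble our system
matrices.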
The mass matrix 


Looking at the global definition of the mass matrix M with components

$$ M_{ij} = \int_D \rho\,\varphi_i\,\varphi_j\,dx \tag{6.51} $$

and considering the specific nature of our basis functions (see Fig. 6.8), we realize
that the only non-zero entries are around the diagonal and are of components
$M_{i,i-1}$, $M_{ii}$, and $M_{i,i+1}$ for $i = 2, \ldots, N-1$. Elements $M_{11}$ and
$M_{NN}$ have to be treated separately. For the diagonal elements we obtain

$$ M_{ii} = \int_D \rho\,\varphi_i\,\varphi_i\,dx = \int_{D_\xi} \rho\,\varphi_i\,\varphi_i\,d\xi \tag{6.52} $$

in the local coordinate system. Integration has to be carried out over the elements
to the left and right of the boundary points $x_i$. We thus obtain

$$ \begin{aligned} M_{ii} &= \rho_{i-1} \int_{-h_{i-1}}^{0} \left( \frac{\xi}{h_{i-1}} + 1 \right)^2 d\xi + \rho_i \int_{0}^{h_i} \left( 1 - \frac{\xi}{h_i} \right)^2 d\xi \\ &= \frac{1}{3} \left( \rho_{i-1} h_{i-1} + \rho_i h_i \right), \end{aligned} \tag{6.53} $$

where we assume that the density is constant inside each element (otherwise we
would have to use numerical integration). For the off-diagonal elements the basis
functions overlap only in one element, for example

$$ M_{i,i-1} = \rho_{i-1} \int_{-h_{i-1}}^{0} \left( \frac{\xi}{h_{i-1}} + 1 \right) \left( -\frac{\xi}{h_{i-1}} \right) d\xi = \frac{1}{6}\,\rho_{i-1} h_{i-1} \tag{6.54} $$

or

$$ M_{i,i+1} = \rho_i \int_{0}^{h_i} \left( 1 - \frac{\xi}{h_i} \right) \frac{\xi}{h_i}\,d\xi = \frac{1}{6}\,\rho_i h_i. \tag{6.55} $$

Fig. 6.8 Basis functions and derivatives for irregular mesh. Example of a
finite-element domain with irregular element sizes $h_i$. The basis functions (thick
solid lines) are illustrated with the normalized derivatives (thin solid lines).

It is instructive to sketch the integration intervals and the relevant basis functions 
using Eq. 6.49 to visualize the above formulation (see Fig. 6.8). 
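The elemental integrals above are easy to verify numerically, because the integrands are low-order polynomials in the local coordinate. The following sketch is an illustration only: the element size `h = 0.7` and all variable names are arbitrary assumptions. It uses NumPy's `Polynomial` class to integrate the products of the local basis functions exactly, and applies the same check to the derivative products that enter the stiffness matrix.

```python
from numpy.polynomial import Polynomial

h = 0.7                        # arbitrary element size (assumption)
xi = Polynomial([0.0, 1.0])    # the local coordinate as a polynomial

phi_up = xi / h + 1            # phi_i on the left element, xi in [-h, 0]
phi_down = 1 - xi / h          # phi_i on the right element, xi in [0, h]
phi_next = xi / h              # phi_{i+1} on the right element, xi in [0, h]

def integrate(p, a, b):
    """Definite integral of polynomial p from a to b."""
    P = p.integ()
    return P(b) - P(a)

# Mass-matrix contributions for unit density
m_diag_left = integrate(phi_up * phi_up, -h, 0.0)       # expected h/3
m_diag_right = integrate(phi_down * phi_down, 0.0, h)   # expected h/3
m_off = integrate(phi_down * phi_next, 0.0, h)          # expected h/6

# Stiffness-matrix contributions for unit shear modulus
k_diag_right = integrate(phi_down.deriv() * phi_down.deriv(), 0.0, h)  # 1/h
k_off = integrate(phi_down.deriv() * phi_next.deriv(), 0.0, h)         # -1/h

print(m_diag_left, m_diag_right, m_off, k_diag_right, k_off)
```

Multiplying these unit-parameter values by the density or shear modulus of the respective element reproduces the diagonal and off-diagonal entries of the system matrices.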

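The banded mass matrix, assuming constant element size h and density $\rho$, reads

$$ \mathbf{M} = \frac{\rho h}{6} \begin{pmatrix} \ddots & & & & 0 \\ 1 & 4 & 1 & & \\ & 1 & 4 & 1 & \\ & & 1 & 4 & 1 \\ 0 & & & & \ddots \end{pmatrix}. \tag{6.56} $$

In the general case, with varying element size, the entries vary along the
diagonals. The matrix nevertheless remains symmetric, since $M_{ij} = M_{ji}$ by
definition (compare Eqs. 6.54 and 6.55).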


The stiffness matrix 


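The same concepts apply to the stiffness matrix. We move to the local coordinate
system by

$$ K_{ij} = \int_D \mu\,\partial_x \varphi_i\,\partial_x \varphi_j\,dx = \int_{D_\xi} \mu\,\partial_\xi \varphi_i\,\partial_\xi \varphi_j\,d\xi \tag{6.57} $$

and obtain for a diagonal element, assuming constant shear modulus $\mu$ inside
each element,

$$ \begin{aligned} K_{ii} &= \mu_{i-1} \int_{-h_{i-1}}^{0} \left( \frac{1}{h_{i-1}} \right)^2 d\xi + \mu_i \int_{0}^{h_i} \left( -\frac{1}{h_i} \right)^2 d\xi \\ &= \frac{\mu_{i-1}}{h_{i-1}} + \frac{\mu_i}{h_i}, \end{aligned} \tag{6.58} $$

and for the off-diagonal elements

$$ \begin{aligned} K_{i,i+1} &= \mu_i \int_{0}^{h_i} \left( -\frac{1}{h_i} \right) \left( \frac{1}{h_i} \right) d\xi = -\frac{\mu_i}{h_i} \\ K_{i,i-1} &= \mu_{i-1} \int_{-h_{i-1}}^{0} \left( \frac{1}{h_{i-1}} \right) \left( -\frac{1}{h_{i-1}} \right) d\xi = -\frac{\mu_{i-1}}{h_{i-1}}, \end{aligned} \tag{6.59} $$

while all other elements of the stiffness matrix are zero. For example, assuming
constant shear modulus and element size the stiffness matrix reads

$$ \mathbf{K} = \frac{\mu}{h} \begin{pmatrix} \ddots & & & & 0 \\ -1 & 2 & -1 & & \\ & -1 & 2 & -1 & \\ & & -1 & 2 & -1 \\ 0 & & & & \ddots \end{pmatrix}. \tag{6.60} $$

At this point we have all the ingredients to assemble a program simulating 1D
elastic wave propagation in a heterogeneous medium with irregular elements.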
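One way to turn these entries into code is to loop over elements and add each element's 2 x 2 contribution to the global matrices. This element-by-element assembly is a common alternative to looping over matrix indices and is offered here only as a sketch: the function name `assemble` and the example mesh are assumptions, and density and shear modulus are taken as one constant value per element, as in the derivation above.

```python
import numpy as np

def assemble(x, rho, mu):
    """Assemble mass and stiffness matrices for linear elements.

    x   : node coordinates (length n)
    rho : density per element (length n - 1)
    mu  : shear modulus per element (length n - 1)
    """
    n = len(x)
    h = np.diff(x)                        # element sizes h_i
    M = np.zeros((n, n))
    K = np.zeros((n, n))
    for e in range(n - 1):                # element e spans [x[e], x[e+1]]
        Me = rho[e] * h[e] / 6 * np.array([[2.0, 1.0], [1.0, 2.0]])
        Ke = mu[e] / h[e] * np.array([[1.0, -1.0], [-1.0, 1.0]])
        M[e:e + 2, e:e + 2] += Me         # adds h/3 on the diagonal, h/6 beside it
        K[e:e + 2, e:e + 2] += Ke
    return M, K

# Hypothetical irregular mesh with element-wise material parameters
x = np.array([0.0, 1.0, 3.0, 3.5, 6.0])
rho = np.array([2.0, 2.5, 3.0, 2.0])
mu = np.array([10.0, 20.0, 15.0, 30.0])
M, K = assemble(x, rho, mu)
h = np.diff(x)
```

With zero-based indices, the interior diagonal entries come out as `(rho[i-1]*h[i-1] + rho[i]*h[i]) / 3` and `mu[i-1]/h[i-1] + mu[i]/h[i]`, matching Eqs. 6.53 and 6.58, while the boundary rows receive the contribution of a single element.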


6.4.2 Simulation example 


Let us initialize a 1D physical domain and simulate elastic wave propagation with 
the algorithm developed above. We start with a homogeneous domain on a reg- 
ular grid and compare with the finite-difference method. The parameters for the 
simulation are given in Table 6.2. Before showing the results we would like to illus- 
trate the striking similarity of the final algorithm to the finite-difference method. 
Despite the fact that we introduced the finite-difference approach through a local 
view, it can also be formulated in matrix—vector form (we have used this already 
in the case of the Chebyshev pseudospectral method). 

It is easy to show (see exercises) that if we initialize a differentiation matrix D
for the second derivative based on finite differences we obtain (according to the
wave equation we multiply by $\mu$)

$$ \mathbf{D} = \frac{\mu}{dx^2} \begin{pmatrix} -2 & 1 & & & \\ 1 & -2 & 1 & & \\ & \ddots & \ddots & \ddots & \\ & & 1 & -2 & 1 \\ & & & 1 & -2 \end{pmatrix} \tag{6.61} $$

and, defining a diagonal mass matrix Minv containing the inverse densities, we
can extrapolate the finite-difference scheme with the algorithm shown in the
following Python code fragment:

# Time extrapolation
for it in range(nt):
    # Finite Difference Method
    unew = (dt**2)*Minv @ (D @ u + f/dx*src[it])\
        + 2*u - uold
    uold, u = u, unew
# [...]
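The matrices used in this fragment are not initialized in the listing. A possible way to set them up, given here as a sketch with arbitrary illustrative values for `nx`, `dx`, `mu` and `rho`, is the following; it also checks the close relation between D and the finite-element stiffness matrix for constant element size.

```python
import numpy as np

nx, dx = 11, 0.1          # illustrative grid (assumption)
mu, rho = 2.0, 2500.0     # illustrative constant material parameters

# Second-derivative operator multiplied by mu (Eq. 6.61)
D = mu / dx**2 * (np.eye(nx, k=1) - 2 * np.eye(nx) + np.eye(nx, k=-1))

# Diagonal matrix containing the inverse densities
Minv = np.diag(np.full(nx, 1.0 / rho))

# Applied to a parabola, the interior rows of D return mu times its
# (constant) second derivative
x = dx * np.arange(nx)
d2 = D @ x**2
print(d2[1:-1])           # all entries close to 2 * mu

# Finite-element stiffness matrix for the same grid (interior rows)
K = mu / dx * (2 * np.eye(nx) - np.eye(nx, k=1) - np.eye(nx, k=-1))
```

For the interior rows `D` equals `-K / dx`, which is the sense in which the stiffness matrix corresponds to the matrix form of the second-derivative operator.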


It is instructive to compare the matrix structures with the finite-element method
in a graphical way. This is shown in Fig. 6.9. As already indicated, the stiffness
matrix is basically equivalent to the matrix form of the second derivative
operator. Only the inverse mass matrices have different structure. While in the
finite-difference method it is diagonal, the finite-element-based mass matrix is
banded around the diagonal.

Table 6.2 Simulation parameters for 1D elastic simulation. Homogeneous case.

Parameter   Value
x_max       10,000 m
nx          1,000
v_s         3,000 m/s
ρ           2,500 kg/m³
h           10 m
eps         0.5
f_0         20 Hz

Fig. 6.9 System matrices. The structure of the system matrices for the
finite-element method is compared with the finite-difference method formulated
with matrix-vector operations. Top row (finite elements): Stiffness and inverse
mass matrix for the finite-element method. Bottom row (finite differences):
Stiffness (differential) matrix and diagonal mass matrix for the finite-difference
method.

The following code example illustrates the calculation of the mass matrix for
the general case of varying element size:


# Mass matrix M_ij
M = zeros((nx, nx))
for i in range(1, nx-1):
    for j in range(1, nx-1):
        if i == j:
            M[i, j] = (ro[i-1] * h[i-1]
                       + ro[i] * h[i]) / 3
        elif j == i + 1:
            M[i, j] = ro[i] * h[i]/6
        elif j == i - 1:
            M[i, j] = ro[i-1] * h[i-1]/6
        else:
            M[i, j] = 0
# Corner elements
M[0, 0] = ro[0] * h[0] / 3
M[nx-1, nx-1] = ro[nx-1] * h[nx-2] / 3


The implementation of the finite-element time extrapolation of the coefficients 
is almost identical to the finite-difference algorithm just presented. The mass 
matrix is inverted before the time extrapolation starts. The source is injected at 
one boundary element point. The source time function is a first derivative of a 
Gaussian. 


# Time extrapolation
for it in range(nt):
    # Finite Element Method
    unew = (dt**2)*Minv @ (f*src[it] - K @ u)\
        + 2*u - uold
    uold, u = u, unew
# [...]


In all listings above h corresponds to the element size, src contains the source time
function, unew, u, and uold are the displacement fields at t + dt, t, and t - dt,
respectively, and f is the force vector.
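Putting the pieces together, a complete homogeneous simulation fits into a few dozen lines. The script below is a minimal sketch, not the original program: it uses the material parameters, element size, eps and frequency of Table 6.2 but a shorter domain (`nx = 400`) and `nt = 300` time steps to keep it fast, it assumes the time step follows from `dt = eps * h / vs`, and the Gaussian width `a` and delay `t0` of the source time function are illustrative choices.

```python
import numpy as np

# Parameters following Table 6.2, with a shorter domain (assumption)
nx, h = 400, 10.0              # number of nodes, element size (m)
vs, rho = 3000.0, 2500.0       # shear velocity (m/s), density (kg/m^3)
mu = rho * vs**2               # shear modulus
eps, f0 = 0.5, 20.0            # stability parameter, dominant frequency (Hz)
dt = eps * h / vs              # assumed relation between eps and dt
nt = 300
isrc = nx // 2                 # source node in the centre of the domain

# System matrices for constant h, rho and mu, including the boundary rows
M = rho * h / 6 * (4 * np.eye(nx) + np.eye(nx, k=1) + np.eye(nx, k=-1))
M[0, 0] = M[-1, -1] = rho * h / 3
K = mu / h * (2 * np.eye(nx) - np.eye(nx, k=1) - np.eye(nx, k=-1))
K[0, 0] = K[-1, -1] = mu / h
Minv = np.linalg.inv(M)        # inverted once before the time loop

# Source time function: first derivative of a Gaussian
t = np.arange(nt) * dt
t0 = 4.0 / f0
a = (np.pi * f0)**2
src = -2.0 * a * (t - t0) * np.exp(-a * (t - t0)**2)

# Force vector: point source at one element boundary
f = np.zeros(nx)
f[isrc] = 1.0

# Time extrapolation (Eq. 6.47)
u = np.zeros(nx)
uold = np.zeros(nx)
for it in range(nt):
    unew = dt**2 * Minv @ (f * src[it] - K @ u) + 2 * u - uold
    uold, u = u, unew

# Position of the pulse travelling to the right of the source
ipeak = isrc + int(np.argmax(u[isrc:]))
print("pulse maximum at x =", ipeak * h, "m")
```

After `nt * dt = 0.5` s the pulse maximum is expected roughly `vs * (0.5 - t0) = 900` m away from the source, which provides a simple sanity check of the propagation velocity.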

Results of the simulation are shown in Fig. 6.10 for various propagation distances
and compared with the finite-difference method. It is interesting to note
that in this case the finite-difference and the finite-element methods have basically
the same dispersive behaviour, but with opposite sign (i.e. in the finite-element
method the high frequencies arrive earlier, like in the pseudospectral method).

Finally, let us explore one of the most important advantages of the finite-element
method: the simplicity with which the element size can vary. This is also
called h-adaptivity (referring to the adaptation of element size h in our case due
to geometrical features or the velocity model).


h-adaptivity 


Think about an Earth model in which the seismic velocities have strong variations.
This is more the rule than the exception. An example is the Earth’s mantle. When
including the P-velocity in the oceans (1.5 km/s) the seismic velocities span almost
an entire order of magnitude given the maximum P-velocities above the core-mantle
boundary (13 km/s). Any numerical scheme with globally constant element size
has to be accurate for the shortest wavelength. Therefore, regions with higher
velocities might be substantially oversampled, reducing computational efficiency.

The numerical methods we have encountered so far (finite differences and 
pseudospectral methods) are not really able to solve this problem in the most 
general case. The finite-element method with options for deformed hexahedral 
or tetrahedral elements offers this flexibility. We demonstrate this in the 1D case 
with a strongly heterogeneous velocity model in which the number of grid points 
per wavelength is kept constant in the entire physical domain. The parameters 
for this model with three different subdomains are given in Table 6.3. The model 
mimics (in a slightly exaggerated way) the situation in a fault zone with a central 
low-velocity zone (damage zone) with different material properties on the two 
sides of the fault (this situation is examined with the 2D finite-difference code; 
see exercises). 

The seismic velocities differ by a factor of 4 (in seismological terms this is indeed
a dramatic velocity contrast). We keep the number of grid points per wavelength
constant across the model by adapting the element size accordingly in each domain. A
source is injected in the centre of the low-velocity zone. The results of the simulation
are shown for the entire domain as a function of time in Fig. 6.11. The
boundaries act as free surfaces, and thus reflect the entire wavefield with reversed
polarity. The results illustrate the reverberating effect of the low-velocity zone.
The dominant frequency in this case is sampled with 30 elements per wavelength.
Note that in this case the h-adaptive mesh also leads to an equal sampling of the
wavefield in time within each subdomain (see exercises).

Fig. 6.10 Finite-element simulation. Snapshots of the displacement wavefield
calculated with the finite-element method (solid line) are compared with
the finite-difference method (dotted line) at various distances from the source,
using the same parameters. The length of the window is 500 m.

Fig. 6.11 Finite-element simulation with varying element size. Snapshots
of displacement values are shown as a function of time. Where displacement
amplitudes are below a threshold, the velocity model is shown in greyscale.
The parameters for this simulation are given in Table 6.3. The element size is
defined such that the number of grid points per wavelength is constant in
the entire physical domain. Note the polarity change of the reflections at the
boundaries and the slope of the signals in the x-t plane indicating their velocities.
Horizontal axis: x (m); vertical axis: time (s).

Table 6.3 Simulation parameters for 1D elastic simulation. Heterogeneous case.

            Left          Middle        Right
x           4,600 m       1,000 m       4,600 m
v_s         6,000 m/s     1,500 m/s     3,000 m/s
dx          40 m          10 m          20 m
ρ           2,500 kg/m³   2,500 kg/m³   2,500 kg/m³

Parameter   Value
nt          18,000
dt          3.3 ms
f0          5 Hz
eps         0.5
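The element sizes in Table 6.3 follow directly from the requirement of a fixed number of elements per dominant wavelength. The following minimal sketch (the variable and function names are illustrative and not taken from the original codes) recomputes them and compares the total number of elements with a hypothetical regular mesh that uses the smallest element size everywhere.

```python
# Subdomain lengths and velocities copied from Table 6.3.
lengths = {"left": 4600.0, "middle": 1000.0, "right": 4600.0}      # m
velocities = {"left": 6000.0, "middle": 1500.0, "right": 3000.0}   # m/s
f0 = 5.0      # dominant frequency (Hz)
ppw = 30      # target number of elements per dominant wavelength


def element_size(v, f0, ppw):
    """Element size that samples the dominant wavelength v / f0 with ppw elements."""
    return v / (f0 * ppw)


# h-adaptive mesh: element size follows the local velocity
dx_adaptive = {k: element_size(v, f0, ppw) for k, v in velocities.items()}

# Regular mesh (hypothetical alternative): smallest element size used everywhere
dx_uniform = min(dx_adaptive.values())

n_adaptive = sum(round(lengths[k] / dx_adaptive[k]) for k in lengths)
n_uniform = sum(round(lengths[k] / dx_uniform) for k in lengths)

# Elements per wavelength that the regular mesh would spend in each subdomain
ppw_uniform = {k: velocities[k] / f0 / dx_uniform for k in velocities}

for k in lengths:
    print(f"{k:>6}: dx = {dx_adaptive[k]:4.0f} m, "
          f"regular mesh would use {ppw_uniform[k]:.0f} elements per wavelength")
print("elements (adaptive, regular):", n_adaptive, n_uniform)
```

Under these assumptions the regular mesh oversamples the fast subdomains by factors of 4 and 2, which is the inefficiency that h-adaptivity avoids.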

Let us have a close-up look at one of the boundaries where the element size
changes. This is shown in Fig. 6.12. There are several points to notice. First,
because of the large velocity change the wavelength in the left domain is substantially
longer. Second, note the continuous, but non-differentiable behaviour of the
wavefield at the boundary. The advantage here is that we do not have to take
space derivatives across the domain boundary as would be the case with the
finite-difference method.

The theory and the applications presented so far were based on the simplest 
linear-basis functions; in terms of polynomial order this corresponds to order 1. In 
the following section, we briefly illustrate how this classical ansatz can be extended 
to higher orders in space. 


6.5 Shape functions in 1D 


How can we formally derive basis functions for finite-element discretizations? Is 
there a way to improve accuracy in space by moving to high orders? 

Let us recall how we replaced the originally continuous unknown field $u(x)$ by
a sum over some basis functions $\varphi_i$:

$$u(x) = \sum_{i=1}^{N} c_i \varphi_i(x), \tag{6.62}$$

where $x \in D$ is defined in the entire physical domain. Here, we denote the
coefficients of the basis functions by $c_i$. This facilitates the notation when going to
higher orders. As mentioned before, a standard procedure in finite-element analysis
is to map all elements to a local coordinate system, which makes life easier as all
integrals can be calculated in the same way (except for some constants). We define

$$\xi = \frac{x - x_i}{x_{i+1} - x_i}, \tag{6.63}$$

where our reference element is defined with $\xi \in [0, 1]$. In the following we
present a formal procedure to derive the so-called shape functions that are used
to describe the solution field. Even though we have encountered the linear form
of these functions already, this procedure will in principle allow extensions to any
order.


Linear shape functions 


We put ourselves at element level and assume that our unknown function $u(\xi)$ is
linear:

$$u(\xi) = c_1 + c_2 \xi, \tag{6.64}$$

where $c_i$ are real coefficients. Each element has two node points, namely the
element boundaries at $\xi_{1,2} = 0, 1$. This leads to the following conditions and
solutions for coefficients $c_i$:

$$
\begin{aligned}
u_1 &= c_1 &&\rightarrow\quad c_1 = u_1, \\
u_2 &= c_1 + c_2 &&\rightarrow\quad c_2 = -u_1 + u_2.
\end{aligned}
\tag{6.65}
$$

This can be written in matrix notation, which will help us when dealing with
high-order systems. We obtain

$$\begin{pmatrix} u_1 \\ u_2 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \end{pmatrix}, \tag{6.66}$$

and, using matrix inversion,

$$\begin{pmatrix} c_1 \\ c_2 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ -1 & 1 \end{pmatrix} \begin{pmatrix} u_1 \\ u_2 \end{pmatrix}. \tag{6.67}$$

With appropriate matrix and vector definitions this can be written as

$$\mathbf{u} = \mathbf{A}\mathbf{c} \;\rightarrow\; \mathbf{c} = \mathbf{A}^{-1}\mathbf{u}, \tag{6.68}$$

implying that to obtain coefficients $\mathbf{c}$ we need to calculate the inverse of $\mathbf{A}$. Putting
the coefficients thus obtained back into the original equation we obtain

$$
\begin{aligned}
u(\xi) &= u_1 + (-u_1 + u_2)\,\xi \\
&= u_1 (1 - \xi) + u_2\,\xi \\
&= u_1 N_1(\xi) + u_2 N_2(\xi),
\end{aligned}
\tag{6.69}
$$

where we introduced a novel concept, the shape functions $N_i(\xi)$ with the
following form

$$N_1(\xi) = 1 - \xi, \qquad N_2(\xi) = \xi. \tag{6.70}$$

This is a fundamental concept holding both for finite- and spectral-element methods
in this nodal form: One elemental shape function is multiplied by a value of
the solution field (e.g. the displacement field) at a specific node point. The sum
over the weighted shape function of general order N

$$u(\xi) = \sum_{i=1}^{N} u_i N_i(\xi) \tag{6.71}$$

gives the approximate continuous representation of the solution field $u(\xi)$ inside
the element. The shape functions are illustrated in Fig. 6.13.

Fig. 6.12 Finite-element simulation, domain boundary. Detail of the finite-element
simulation (snapshot of the displacement field) with varying element
size at one of the domain boundaries. The crosses indicate the element
boundaries and the changing element size. Note the continuous but
non-differentiable behaviour of the displacement field at the interface.

Fig. 6.13 Finite-element shape functions. Top: Linear shape functions as
used in the development of the finite-element solution to static and dynamic
elastic problems. Node points are indicated by crosses. Bottom: Quadratic
shape functions requiring one more node point at the centre of the element.
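As a small illustration of how Eqs. (6.63), (6.70), and (6.71) work together, the sketch below evaluates a piecewise-linear field at an arbitrary position of a 1D mesh with varying element size. The function names and the example mesh are assumptions made for illustration; they are not part of the codes accompanying the text.

```python
import numpy as np


def linear_shape_functions(xi):
    """N1 and N2 of Eq. (6.70) on the reference element xi in [0, 1]."""
    return np.array([1.0 - xi, xi])


def interpolate(x, x_nodes, u_nodes):
    """Evaluate the piecewise-linear field at position x.

    x_nodes: increasing node coordinates, u_nodes: nodal values of the field.
    """
    x_nodes = np.asarray(x_nodes, dtype=float)
    u_nodes = np.asarray(u_nodes, dtype=float)
    # index i of the element [x_i, x_{i+1}] that contains x
    i = np.searchsorted(x_nodes, x, side="right") - 1
    i = int(np.clip(i, 0, len(x_nodes) - 2))
    # map to the local coordinate, Eq. (6.63)
    xi = (x - x_nodes[i]) / (x_nodes[i + 1] - x_nodes[i])
    # weighted sum of shape functions, Eq. (6.71) with two nodes per element
    n1, n2 = linear_shape_functions(xi)
    return n1 * u_nodes[i] + n2 * u_nodes[i + 1]


# Hypothetical mesh with varying element size and arbitrary nodal values
x_nodes = [0.0, 40.0, 50.0, 70.0]
u_nodes = [0.0, 4.0, 2.0, 2.0]
print(interpolate(20.0, x_nodes, u_nodes), interpolate(45.0, x_nodes, u_nodes))
```

The nodal values are reproduced exactly at the node points, and the field is continuous but has a kink wherever the slope changes between neighbouring elements.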

Quadratic shape functions 

Extending these concepts to higher orders is straightforward. Describing our
solution field by quadratic functions requires

$$u(\xi) = c_1 + c_2 \xi + c_3 \xi^2, \tag{6.72}$$

where we added one more node point at the centre of the element, $\xi_{1,2,3} = 0, 1/2, 1$.
With these node locations we obtain

$$
\begin{aligned}
u_1 &= c_1 \\
u_2 &= c_1 + 0.5\,c_2 + 0.25\,c_3 \\
u_3 &= c_1 + c_2 + c_3,
\end{aligned}
\tag{6.73}
$$

and after inverting the resulting system matrix $\mathbf{A}$

$$\mathbf{A}^{-1} = \begin{pmatrix} 1 & 0 & 0 \\ -3 & 4 & -1 \\ 2 & -4 & 2 \end{pmatrix} \tag{6.74}$$

we can represent the final quadratic solution field inside the element with

$$
\begin{aligned}
u(\xi) &= c_1 + c_2 \xi + c_3 \xi^2 \\
&= u_1 (1 - 3\xi + 2\xi^2) + u_2 (4\xi - 4\xi^2) + u_3 (-\xi + 2\xi^2),
\end{aligned}
\tag{6.75}
$$

resulting in the following shape functions

$$
\begin{aligned}
N_1(\xi) &= 1 - 3\xi + 2\xi^2 \\
N_2(\xi) &= 4\xi - 4\xi^2 \\
N_3(\xi) &= -\xi + 2\xi^2,
\end{aligned}
\tag{6.76}
$$

illustrated in Fig. 6.13 (bottom). The extension to cubic shape functions is
straightforward but in that case derivative information at the boundaries is
necessary to constrain the coefficients.
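The matrix-inversion recipe of Eq. (6.68) can be checked numerically. The sketch below is one possible way to do this with NumPy (the function names are illustrative assumptions): it builds the matrix A from the monomials evaluated at the node points, inverts it, and reads the polynomial coefficients of each shape function from the columns of the inverse.

```python
import numpy as np


def shape_function_coefficients(nodes):
    """Return the inverse of A for u = A c (Eq. 6.68).

    A[k, j] = nodes[k]**j, i.e. the monomials 1, xi, xi**2, ... evaluated at the
    node points. Column k of the inverse holds the monomial coefficients of N_k.
    """
    nodes = np.asarray(nodes, dtype=float)
    A = np.vander(nodes, increasing=True)
    return np.linalg.inv(A)


def evaluate_shape_functions(xi, nodes):
    """Values of all shape functions N_k at the local coordinate xi."""
    A_inv = shape_function_coefficients(nodes)
    powers = float(xi) ** np.arange(len(nodes))   # 1, xi, xi**2, ...
    return powers @ A_inv


A_inv_linear = shape_function_coefficients([0.0, 1.0])
A_inv_quadratic = shape_function_coefficients([0.0, 0.5, 1.0])
print(A_inv_quadratic)
print(evaluate_shape_functions(0.25, [0.0, 0.5, 1.0]))
```

With the three nodes at 0, 1/2, and 1 the inverse reproduces the matrix of Eq. (6.74), and each shape function takes the value one at its own node and zero at the others.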


6.6 Shape functions in 2D 


Shape functions start getting interesting with more dimensions. The most frequently
used element shapes in 2D are triangles (e.g. after Delaunay triangulation
of arbitrary point clouds) and rectangles. We limit ourselves to the linear case.


Triangular shape functions 


We start with a triangle of arbitrary shape defined somewhere in x-y space. This is
illustrated in Fig. 6.14. To perform the integration operations when calculating the
system matrices we move to the local coordinate system $\xi, \eta \in [0, 1]$ (sometimes
the reference space is chosen to be $[-1, 1]$), through

$$
\begin{aligned}
x &= x_1 + (x_2 - x_1)\,\xi + (x_3 - x_1)\,\eta \\
y &= y_1 + (y_2 - y_1)\,\xi + (y_3 - y_1)\,\eta.
\end{aligned}
\tag{6.77}
$$

We seek to describe a linear function inside our triangle; therefore

$$u(\xi, \eta) = c_1 + c_2 \xi + c_3 \eta. \tag{6.78}$$

Fig. 6.14 Triangular elements. Mapping of physical coordinates (x, y, top)
to a local reference frame (bottom) with coordinates $\xi, \eta$.


Fig. 6.15 Triangular elements, shape functions. The three corner nodes lead to
an equivalent number of shape functions $N_i(\xi, \eta)$ with unit value at one of the
corners.

Fig. 6.16 Quadrilateral elements. Mapping of physical coordinates (x, y,
top) to a local reference frame (bottom) with coordinates $\xi, \eta$.


We only know our function at the corners of the reference triangle; therefore the
constraints for coefficients $c_i$ are

$$
\begin{aligned}
u_1 &= u(0, 0) = c_1 \\
u_2 &= u(1, 0) = c_1 + c_2 \\
u_3 &= u(0, 1) = c_1 + c_3.
\end{aligned}
\tag{6.79}
$$

This leads, using the same matrix inversion approach described above, to the
following shape functions for triangular elements:

$$
\begin{aligned}
N_1(\xi, \eta) &= 1 - \xi - \eta \\
N_2(\xi, \eta) &= \xi \\
N_3(\xi, \eta) &= \eta.
\end{aligned}
\tag{6.80}
$$

These shape functions are illustrated in Fig. 6.15. Again, extension to higher
orders is possible.
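To make the triangular case concrete, the following sketch implements the mapping of Eq. (6.77) and the shape functions of Eq. (6.80) and uses them to interpolate corner values inside an arbitrary triangle. The vertex coordinates, the test field, and the function names are hypothetical choices for illustration only.

```python
import numpy as np


def triangle_shape_functions(xi, eta):
    """Linear shape functions N1, N2, N3 of Eq. (6.80) on the reference triangle."""
    return np.array([1.0 - xi - eta, xi, eta])


def map_to_physical(xi, eta, vertices):
    """Map local coordinates (xi, eta) to physical (x, y) following Eq. (6.77)."""
    (x1, y1), (x2, y2), (x3, y3) = vertices
    x = x1 + (x2 - x1) * xi + (x3 - x1) * eta
    y = y1 + (y2 - y1) * xi + (y3 - y1) * eta
    return x, y


def interpolate_in_triangle(xi, eta, u_corners):
    """Sum of corner values weighted by the shape functions."""
    return float(triangle_shape_functions(xi, eta) @ np.asarray(u_corners, dtype=float))


# Hypothetical triangle and a linear test field sampled at its corners
vertices = [(1.0, 1.0), (4.0, 2.0), (2.0, 5.0)]


def field(x, y):
    return 2.0 * x - y + 3.0


u_corners = [field(x, y) for x, y in vertices]

xi, eta = 0.2, 0.5
x, y = map_to_physical(xi, eta, vertices)
print(interpolate_in_triangle(xi, eta, u_corners), field(x, y))
```

Because the shape functions are linear, a linear field is reproduced exactly inside the element; any field with curvature would only be approximated.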


Rectangular shape functions 


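Similarly, shape functions can be derived for general quadrilateral elements. An
example is shown in Fig. 6.16. We map space to a local coordinate system through

$$
\begin{aligned}
x &= x_1 + (x_2 - x_1)\,\xi + (x_4 - x_1)\,\eta + (x_1 - x_2 + x_3 - x_4)\,\xi\eta \\
y &= y_1 + (y_2 - y_1)\,\xi + (y_4 - y_1)\,\eta + (y_1 - y_2 + y_3 - y_4)\,\xi\eta.
\end{aligned}
\tag{6.81}
$$

Requiring linear behaviour of the function inside the element

$$u(\xi, \eta) = c_1 + c_2 \xi + c_3 \eta + c_4 \xi\eta \tag{6.82}$$

we obtain the following shape functions:

$$
\begin{aligned}
N_1(\xi, \eta) &= (1 - \xi)(1 - \eta) \\
N_2(\xi, \eta) &= \xi (1 - \eta) \\
N_3(\xi, \eta) &= \xi \eta \\
N_4(\xi, \eta) &= (1 - \xi)\,\eta.
\end{aligned}
\tag{6.83}
$$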


These four shape functions are illustrated in Fig. 6.17. There are many more as- 
pects to shape functions, in particular in connection with numerical integration. 
The reader is referred to the extensive coverage in Zienkiewicz et al. (2013). 
Numerical integration will be discussed in Chapter 7 in connection with the 
spectral-element method. 
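The four shape functions of Eq. (6.83) can also be used to map the reference square onto a deformed quadrilateral by weighting the corner coordinates. This isoparametric choice is an assumption of the sketch below, as are the corner coordinates and function names; the corners are numbered counter-clockwise starting at the local origin.

```python
import numpy as np


def quad_shape_functions(xi, eta):
    """Shape functions N1..N4 of Eq. (6.83) on the reference square [0, 1] x [0, 1]."""
    return np.array([
        (1.0 - xi) * (1.0 - eta),
        xi * (1.0 - eta),
        xi * eta,
        (1.0 - xi) * eta,
    ])


def map_to_physical(xi, eta, corners):
    """Weight the corner coordinates with the shape functions (isoparametric map)."""
    return quad_shape_functions(xi, eta) @ np.asarray(corners, dtype=float)


# Hypothetical deformed quadrilateral, corners numbered counter-clockwise
corners = [(0.0, 0.0), (2.0, 0.0), (3.0, 2.0), (0.0, 1.0)]

for local in [(0, 0), (1, 0), (1, 1), (0, 1), (0.5, 0.5)]:
    print(local, "->", map_to_physical(*local, corners))
```

Each corner of the reference square lands on the corresponding corner of the physical element, and the shape functions sum to one everywhere, so constant fields are reproduced exactly.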


6.7 The road to 3D 


The extension of the simple finite-element codes presented here to 2D and 3D
is substantially more involved than for 3D finite-difference or pseudospectral
approaches. The reason is that in 2D the global system matrices already become
huge, and handling those requires getting into linear algebra algorithms.
This goes far beyond the scope of this introductory text (even Zienkiewicz et al.
(2013) only touch on the problem). However, usually this is handled by libraries
(like LAPACK) that are specifically designed and optimized to solve large linear
systems. Those interested in descriptions of 3D implementations and strategies
for parallelization are referred to Bao et al. (1996) and Bielak et al. (1998).
Recent extensions include the introduction of spatially adaptive meshes using the
octree approach (Bielak et al., 2005).

Fig. 6.17 Rectangular reference elements. The four linear shape functions
$N_i(\xi, \eta)$ with unit value at one of the corners.
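To give a feeling for the structure of the global system matrices, the sketch below assembles mass and stiffness matrices for a 1D mesh with varying element size. It assumes linear elements with density and shear modulus constant inside each element, for which the element matrices have the standard closed forms used in the code; the mesh values and function name are illustrative and dense NumPy arrays are used only for readability.

```python
import numpy as np


def assemble_1d(h, rho, mu):
    """Assemble global mass and stiffness matrices for linear 1D elements.

    h, rho, mu: element size, density, and shear modulus per element
    (assumed constant inside each element).
    """
    h, rho, mu = (np.asarray(a, dtype=float) for a in (h, rho, mu))
    n_el = len(h)
    M = np.zeros((n_el + 1, n_el + 1))
    K = np.zeros((n_el + 1, n_el + 1))
    for e in range(n_el):
        # closed-form element matrices for linear shape functions
        Me = rho[e] * h[e] / 6.0 * np.array([[2.0, 1.0], [1.0, 2.0]])
        Ke = mu[e] / h[e] * np.array([[1.0, -1.0], [-1.0, 1.0]])
        # element e connects global nodes e and e + 1
        idx = np.ix_([e, e + 1], [e, e + 1])
        M[idx] += Me
        K[idx] += Ke
    return M, K


# Hypothetical mesh: two coarse, two fine, and one intermediate element
h = np.array([40.0, 40.0, 10.0, 10.0, 20.0])                 # m
v = np.array([6000.0, 6000.0, 1500.0, 1500.0, 3000.0])       # m/s
rho = np.full(5, 2500.0)                                     # kg/m^3
mu = rho * v**2                                              # Pa

M, K = assemble_1d(h, rho, mu)
print("non-zero entries in M:", np.count_nonzero(M), "of", M.size)
```

Even in this tiny example most entries are zero (the matrices are tridiagonal). In 2D and 3D the number of unknowns grows rapidly, which is why sparse storage and dedicated solvers become necessary.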

To my knowledge it is fair to say that the classic low-order finite-element 
method plays a minor role today for large-scale computational problems in seis- 
mology, exploration seismics, and rupture dynamics, in comparison with the 
finite-difference or the spectral-element methods. The reason is, on one hand, 
that extensions of the finite-element method towards higher orders using a combination
of Lagrange polynomials and Gauss integration lead to a global system
of equations that does not require linear algebra tools for handling huge matrix
inversions (this is called the spectral-element method, see Chapter 7). On the other
hand, the resulting explicit extrapolation algorithm can be parallelized using do- 
main decomposition in a very efficient way, and does not need to rely on linear 
algebra libraries. 

Other flavours, like the finite-element discontinuous Galerkin method, have 
recently been introduced to seismic wave propagation, provoking a lot of inter- 
est, in particular for dynamic rupture problems and wave propagation through 
media with highly complex geometrical features. Because of the importance of 
these developments in current research, each of these methods receives its own 
chapter here. 


Chapter summary 


• The finite-element method was originally developed for static structural
engineering problems.

• The element concept relates to describing the solution field in an analogous
way inside each element, thereby facilitating the required calculations of
the system matrices.

• The finite-element approach can in principle be applied to elements of
arbitrary shape. Most used shapes are triangles (tetrahedral geometries)
or quadrilaterals (hexahedral geometries).

• The finite-element method is a series expansion method. The continuous
solution field is replaced by a finite sum over (not necessarily orthogonal)
basis functions.

• For static elastic problems or the elastic wave-propagation problem,
finite-element analysis leads to a (large) system of linear equations. In general,
the matrices are of size N x N where N is the number of degrees of
freedom.

• Because of the specific interpolation properties of the basis functions, their
coefficients take the meaning of the values of the solution field at specific
node points.

• In an initialization step, the global stiffness and mass matrices have to be
calculated. They depend on integrals over products of basis functions and
their derivatives.

• If equation parameters (e.g. elastic parameters, density) vary inside
elements, numerical integration has to be performed.

• The stress-free surface condition is implicitly solved. This is a major advantage
compared to other methods (e.g. the finite-difference method),
particularly in the presence of surface topography.

• The classic finite-element method plays a minor role in seismology compared
with its higher-order extension. The spectral-element method leads
to a fully explicit scheme, which is easier to implement on parallel
hardware.


FURTHER READING 


• Zienkiewicz et al. (2013) provides an excellent introduction to the basics of
finite-element analysis, with an engineering focus.

• Durran (1999) contains a compact introduction to the finite-element
method. He discusses several other numerical methods for wave-propagation
problems in a comparative way.

• Strang (1988) and later editions is another classic book on finite-element
analysis with very clear descriptions of simple 1D problems.

• A nice introduction to high-performance computing with examples in linear
algebra (solution of large linear systems) is given in the recent book by
Eijkhout (2015) (freely available as a PDF).


EXERCISES 


Comprehension questions 


(6.1) In which community was the finite-element method primarily developed?
Give some typical problems.

(6.2) What are weak and strong forms of partial differential equations? Give
examples.

(6.3) Discuss the pros and cons of the finite-element method vs. low-order
finite-difference methods.

(6.4) Present and discuss problem classes that can be handled well with the
finite-element method. Compare with problems better handled with other
methods.

(6.5) Compare the spatial discretization strategies of finite-element and
finite-difference methods.

(6.6) Describe the derivation strategy of finite-element shape functions.

(6.7) Discuss qualitatively (use sketches) the use of basis functions. Compare
with the interpolation properties of the pseudospectral method.

(6.8) Is the finite-element method a global or a local scheme? Explain.

(6.9) Why does the finite-element method require the solution of a (possibly
huge) system of linear equations? What is the consequence for parallel
computing?

(6.10) Why is the classic linear finite-element method not used so much for
seismological research today?

(6.11) Explain the benefits of the finite-element method with respect to Earth
models with complex geometries.


Theoretical problems 


(6.12) The advection equation is

$$\partial_t q(x, t) + c(x)\,\partial_x q(x, t) = 0,$$

where $q(x, t)$ is the scalar quantity to be advected and $c(x)$ is the advection
velocity. Write down the weak form of this equation and perform integration
by parts. What happens to the anti-derivative? Does it cancel out at
the boundaries as in the 1D elastic wave equation? Note: This is the point
of departure for the discontinuous Galerkin method.

(6.13) Are the linear basis functions

$$
\varphi_i(x) =
\begin{cases}
\dfrac{x - x_{i-1}}{x_i - x_{i-1}} & \text{for } x_{i-1} < x \le x_i \\[2ex]
\dfrac{x_{i+1} - x}{x_{i+1} - x_i} & \text{for } x_i < x < x_{i+1} \\[2ex]
0 & \text{elsewhere}
\end{cases}
$$

orthogonal?

(6.14) Derive the formulae for the calculation of the mass matrix elements
(Eq. 6.54 and Eq. 6.55) by sketching the integration interval, and the
corresponding basis functions.

(6.15) Calculate all entries of the stiffness matrix $K_{ij} = \int_D \mu\,\partial_x \varphi_i\,\partial_x \varphi_j\,dx$ for a static
elastic problem with $\mu$ = 70 GPa and h = 1 m for a problem with n = 5
degrees of freedom.

(6.16) A finite-element system has the following parameters: Element sizes h =
[1, 3, 0.5, 2, 4], density $\rho$ = [2, 3, 2, 3, 2] kg/m³. Calculate the entries of
the mass matrix given by $M_{ij} = \int_D \rho\,\varphi_i \varphi_j\,dx$ using linear basis functions.

(6.17) h-adaptivity. For the simulation with varying velocities and element size
with parameters given in Table 6.3, calculate the time step required for
$\epsilon = 0.5$ in each of the subdomains. Remember that the stability criterion
is $\epsilon = c_{max}\,dt/dx$ where $c_{max}$ is the maximum velocity in the entire physical
domain. Discuss the result.

(6.18) Follow the approach of the derivation of shape functions and derive the
cubic case in 1D: $u(\xi) = c_1 + c_2 \xi + c_3 \xi^2 + c_4 \xi^3$. What are key differences
compared to quadratic and linear cases?

(6.19) Derive the quadratic shape functions $N(\xi, \eta)$ for 2D triangles with the
following node points:

P1(0, 0), P2(1, 0), P3(0, 1),
P4(1/2, 0), P5(1/2, 1/2), P6(0, 1/2).

Note: Use Python (or another program) to solve the linear system of
equations.

(6.20) Derive the quadratic shape functions $N(\xi, \eta)$ for 2D rectangles with the
following node points:

P1(0, 0), P2(1/2, 0), P3(1, 0), P4(1, 1/2),
P5(1, 1), P6(1/2, 1), P7(0, 1), P8(0, 1/2).

(6.21) Derive the derivative matrix D for the finite-difference-based second
derivative (Eq. 6.61). Show that when applied to a vector u that contains
an appropriate function (e.g. a Gaussian, sin function) you obtain
an approximation of its derivative.


Programming exercises 


(6.22) Write a computer program that solves the 1D static elasticity problem
(Eq. 6.19) using finite elements. Also code the finite-difference-based
relaxation problem (Eq. 6.33) and compare the results. Reproduce Fig. 6.7.
Extend the formulation to arbitrary element sizes.

(6.23) Code the 1D elastic wave equation using finite elements (Eq. 6.47).
Determine numerically the stability limit and compare with the
finite-difference solution. Implement the analytical solution for the homogeneous
case (note: it is the same as the 1D acoustic wave equation). Compare
the numerical dispersion behaviour of the finite-element method
with the corresponding low- (or high-) order finite-difference method.

(6.24) Initialize a strongly heterogeneous velocity model with spatially varying
element size. Try to match the results with a regular-grid finite-difference
implementation of the same model. Discuss the two approaches in terms
of time step, run time, and memory usage.

(6.25) Plot the high-order 2D shape functions derived in the theoretical problems
above.

(6.26) Derive a finite-difference-based centred differentiation matrix for the
first derivative. Implement the 1D elastic wave equation in matrix form.
Compare with the 1D finite-element implementation.


1 The term spectral should not give the impression that the method is based on
solutions of the wave equation in the spectral domain. This is not the case. As with all
other previous methods, the solutions are sought in the space-time domain.


The Spectral-Element Method


The spectral-element method¹ is currently one of the most widely used numerical
approaches for seismic wave-propagation problems. Let us briefly compare it with 
the other methods discussed so far and outline why this might be the case. 

The finite-difference method suffers in particular from the difficulties in 
accurately implementing free-surface boundary conditions in the case of realis- 
tic topography. The elegant pseudospectral method exploited for the first time 
the spectral convergence of function interpolation (and derivative) for specific 
choices of basis functions. This will be a central ingredient of the spectral-element 
method, but at a local elemental level. The global communication requirements 
of the pseudospectral method prevent good scaling on parallel hardware, and it 
is thus unattractive for 3D problems. Furthermore, adaptation to models with 
complex geometry is difficult. 

The classic low-order finite-element approach only fixes part of the problem. 
A major advantage it has is the fact that the free-surface boundary condition 
comes for free as it is implicitly solved. In addition, unstructured tetrahedral 
or hexahedral grids are possible, allowing geometrically complex model fea- 
tures. However, a large linear system of equations has to be solved, and this is 
cumbersome to implement efficiently on parallel hardware. 

So what makes the spectral-element method so powerful? By using a specific
set of basis functions inside the elements (Lagrange polynomials), combined
with an interpolation scheme based upon the Gauss-Lobatto-Legendre (GLL)
collocation points, the mass matrix that needs to be inverted in the finite-element
formulation becomes diagonal (note that this only works with rectangular grids
in 2D or hexahedral grids in 3D). This implies that the scheme can be explicitly 
extrapolated just like finite-difference or pseudospectral implementations, with- 
out the need to solve a large linear system of equations. In combination with the 
interpolation properties of Lagrange polynomials, this makes the algorithm ex- 
tremely efficient and lends itself to implementation on parallel hardware. As is 
the case with other finite-element-type schemes, the geometrical flexibility comes 
with the requirement to generate a computational mesh that is numerically stable 
(see Fig. 7.1). 
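The diagonal mass matrix can be verified numerically for a single element. The sketch below is an illustration under stated assumptions: unit density and unit Jacobian, a reference element [-1, 1], and polynomial order N = 4. The function names are not taken from any existing code. It compares the elemental mass matrix obtained with GLL quadrature with the one obtained by exact integration using Gauss-Legendre quadrature.

```python
import numpy as np
from numpy.polynomial import legendre as leg


def gll_points_and_weights(N):
    """GLL points and weights on [-1, 1] for polynomial order N."""
    PN = leg.Legendre.basis(N)
    # interior points are the roots of the derivative of the Legendre polynomial
    interior = np.sort(np.real(PN.deriv().roots()))
    x = np.concatenate(([-1.0], interior, [1.0]))
    w = 2.0 / (N * (N + 1) * PN(x) ** 2)
    return x, w


def lagrange_basis(nodes, x):
    """Matrix L with L[i, k] = value of the i-th Lagrange polynomial at x[k]."""
    nodes = np.asarray(nodes, dtype=float)
    x = np.asarray(x, dtype=float)
    L = np.ones((len(nodes), len(x)))
    for i, xi in enumerate(nodes):
        for j, xj in enumerate(nodes):
            if j != i:
                L[i] *= (x - xj) / (xi - xj)
    return L


def mass_matrix(nodes, q_points, q_weights):
    """M[i, j] = sum_k w_k l_i(q_k) l_j(q_k) (unit density and Jacobian assumed)."""
    L = lagrange_basis(nodes, q_points)
    return (L * q_weights) @ L.T


N = 4
nodes, w_gll = gll_points_and_weights(N)

# GLL quadrature on the interpolation points themselves: diagonal matrix
M_gll = mass_matrix(nodes, nodes, w_gll)

# Exact integration with (N + 1)-point Gauss-Legendre quadrature: full matrix
q, w_gauss = leg.leggauss(N + 1)
M_exact = mass_matrix(nodes, q, w_gauss)

print(np.round(M_gll, 4))
print(np.round(M_exact, 4))
```

In this sketch the diagonal form appears because each Lagrange polynomial equals one at its own collocation point and zero at all others, so with GLL quadrature only the quadrature weights remain on the diagonal. The price is that the quadrature is no longer exact for products of two polynomials of order N.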

Following a brief section on its history, the focus will be on illustrating the 
various ingredients that make up the power of the spectral-element method. 


Computational Seismology. First Edition. Heiner Igel. 
© Heiner Igel 2017. Published in 2017 by Oxford University Press. 


7.1 History 


The spectral-element method was born out of ideas developed within the frame- 
work of pseudospectral methods, which used the concepts of exact interpolation 
on collocation points with spectral convergence properties.² The similarity with
the concept of basis functions in the classic finite-element method is obvious and 
the idea was to take advantage of the exponential convergence properties of the 
spectral basis functions. The first appearances of the combined approach were in 
Patera (1984) and Maday and Patera (1989) in fluid dynamics. These authors 
were also the first to use the term ‘spectral elements’. 

Spectral-element formulations for elastic wave problems were first published 
by Priolo et al. (1994), Seriani and Priolo (1994), and Faccioli et al. (1996). In 
these early applications Chebyshev polynomials were used to approximate the 
unknown fields. This improved the dispersion properties compared to classic 
finite-element methods. However, the necessity to invert a large linear system 
of equations still inhibited wide use of the approach. Building on the work by 
Maday and Patera (1989), the breakthrough in seismology came with the work 
of Komatitsch and Vilotte (1998), which introduced the combination of La- 
grange polynomials as interpolants and an integration scheme based on Gauss 
quadrature defined on the GLL points for the elastic wave equation. This led to 
a diagonal mass matrix that can be trivially inverted. As a consequence, a fully 
explicit scheme is possible that is easy to parallelize and has the desired properties 
described in the previous section. 

A further milestone was the implementation of the spectral-element method 
for global wave propagation by Chaljub (2000) and Chaljub et al. (2003) us- 
ing the cubed-sphere concept by Ronchi et al. (1996). This allowed for the 
first time the simulation of the complete wavefield in a 3D heterogeneous spher- 
ical Earth and triggered the development of the specfem3d community code 
(Komatitsch and Tromp, 2002a; Komatitsch and Tromp, 2002b; see Fig. 7.2),
which is, at the time of writing, undoubtedly one of the best-engineered openly 
accessible wave-propagation simulation tools. Fichtner and Igel (2008) imple- 
mented a spectral-element method for wave propagation in spherical coordinates 
(3D spherical sections) that led to the first ever application of the adjoint inver- 
sion method to regional earthquake data. A recent review paper by Peter et al. 
(2011) demonstrates the great flexibility of the spectral-element method for both 
modelling and inversion. 

The geometrical flexibility of this approach of course comes with the necessity
to prepare a potentially complex computational mesh.³

As indicated, the advantages of the spectral-element approach concerning the
diagonal mass matrix are restricted to hexahedral elements in 3D. This implies
that the preparation of a mesh from given geophysical data such as the
free-surface topography, and the curved, discontinuous internal boundaries, can be
a task that takes weeks to months of hard work, and expertise in the use of meshing
software (e.g. CUBIT, ANSYS) is required. As indicated in the introduction,
the efficient meshing of specific Earth models for simulation purposes both for
hexahedral and tetrahedral meshes is still an open issue.

Fig. 7.1 A spectral-element mesh to model soil-structure interactions with a
hexahedral grid implementation. Note the variations in element size and the
deformation of the hexahedral elements with curved boundaries. Figure courtesy
of M. Stuppazzini.

Fig. 7.2 Specfem3d: Logo of the well-known community code with a snapshot
of global wave propagation for a simulation of the devastating M9.1 earthquake
near Sumatra in December 2004. The figure was used as the title page
of Science on 20 May 2005. Reprinted with permission.

2 This is related to the question of how fast an approximation converges to the
exact function. In (pseudo-) spectral methods the rate of convergence has an
exponential form, provided the function is sufficiently smooth.

3 Several other spectral-element developments are discussed in Chapter 10
on applications. Freely available spectral-element codes are listed in the Appendix.

In the following section we look at some of the basic concepts of spectral ele- 
ments. Subsequently, we develop the complete mathematical formulation for the 
solution of the 1D elastic wave equation. 


7.2 Spectral elements in a nutshell 


Let us set the stage for assembling the various ingredients of the spectral-element 
method applied to the wave-propagation problem. Again, we start with the classic 
1D elastic wave equation: 


$$\rho\,\partial_t^2 u = \partial_x (\mu\,\partial_x u) + f, \tag{7.1}$$

in which the displacement $u$, external force $f$, mass density $\rho$, and shear modulus
$\mu$ depend on $x$, and $u$ and $f$ also on $t$. In the following these dependencies are
implicitly assumed. An important boundary property that has to be obeyed is the
stress-free condition that occurs at the Earth’s surface. This condition is expressed
as the vanishing of the traction on the free surface with normal vector $n_j$:

$$\sigma_{ij}\,n_j = 0, \tag{7.2}$$

where $\sigma_{ij}$ is the symmetric stress tensor. Using the stress-strain relation and
adapting to our elastic 1D problem, we obtain

$$\mu\,\partial_x u(x, t)\big|_{x=0,L} = 0, \tag{7.3}$$

where our spatial boundaries are at x = 0, L and the stress-free condition applies 
at both ends (other conditions like absorbing boundaries are also possible). 
Before we dive into the details of numerical analysis, let us illustrate some 
basic concepts graphically. In Fig. 7.3 a snapshot of a 1D displacement wavefield 
simulated with the spectral-element method is shown for a medium with a random 
distribution of elastic parameters. The top figure is a zoomed-in view of one of 
the elements, the central concept of finite-element-type techniques. Inside each 
element we approximate the unknown function u by a sum over a set of basis
functions (thin solid lines). In this case the unknown function is approximated by 
a sum over Lagrange polynomials of a specific order. The order determines the 
number of points inside the elements (black squares in Fig. 7.3, top) at which the 
solution is exactly interpolated. This could in principle be achieved for a set of 
regularly spaced points. However, for various reasons that will become apparent, 
it is preferable to use a specific set of collocation points known as the GLL points. 
As is obvious from Fig. 7.3, these points are unevenly spaced. When higher- 
order polynomials are used, the grid points densify even more towards the 


element boundaries. This implies that we have to decrease time steps, while 
keeping everything else the same. As presented in Chapter 6 on the finite-element 
method, we also need to integrate basis functions, their derivatives, and elastic 
parameters over the elements. Thus we need to come up with a numerical integration
scheme, as an analytical integration is not in general possible. The method
of choice is a special case of the Gauss quadrature approach, the Gauss-Lobatto-Legendre
quadrature, that consists of a sum over the function to be integrated,
evaluated at the GLL points appropriately weighted. These points are equivalent
to those illustrated in Fig. 7.3, top. It is precisely this fact that leads to a global 
system of matrix equations that can be solved with high efficiency as no global 
system matrix inversion is required. 
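The two properties of the GLL points mentioned above, their uneven spacing and their use as quadrature points, can be explored with a few lines of NumPy. The sketch below assumes the reference interval [-1, 1] and uses the standard expressions for GLL points (end points plus the roots of the derivative of the Legendre polynomial of order N) and weights; the function names are illustrative.

```python
import numpy as np
from numpy.polynomial import legendre as leg


def gll_points_and_weights(N):
    """GLL points and weights on [-1, 1] for polynomial order N."""
    PN = leg.Legendre.basis(N)
    interior = np.sort(np.real(PN.deriv().roots()))
    x = np.concatenate(([-1.0], interior, [1.0]))
    w = 2.0 / (N * (N + 1) * PN(x) ** 2)
    return x, w


def gll_integrate(f, N):
    """Approximate the integral of f over [-1, 1]: weighted sum over GLL points."""
    x, w = gll_points_and_weights(N)
    return float(np.sum(w * f(x)))


x4, w4 = gll_points_and_weights(4)
print("points :", np.round(x4, 4))
print("weights:", np.round(w4, 4))

# Smallest distance between neighbouring points compared with regular spacing 2 / N
min_spacing = {}
for N in (2, 4, 8):
    x, _ = gll_points_and_weights(N)
    min_spacing[N] = float(np.diff(x).min())
    print(f"N = {N}: smallest GLL spacing {min_spacing[N]:.4f}, regular spacing {2.0 / N:.4f}")

# GLL quadrature with N + 1 points integrates polynomials up to degree 2N - 1 exactly
print(gll_integrate(lambda x: x**6, 4), 2.0 / 7.0)
```

The shrinking distance between the points near the element boundaries is what forces smaller time steps when the polynomial order is increased while the element size is kept fixed.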

Thus, in the following sections, we (1) formulate the weak form of the wave 
equation, and (2) provide the transformation of the equation down to the elemen- 
tal level, introducing the concept of the Jacobian. The discretization of our system 
comes with (3) the approximation of our unknown function u using Lagrange 
polynomials as interpolants. The formulation also requires (4) the evaluation of 
the first derivatives of the Lagrange polynomials, the calculation of which requires 
the Legendre polynomials. (5) The numerical integration scheme based on GLL 
quadrature allows us to calculate all system matrices at elemental level, which are 
then (6) assembled in a final step to obtain the global system of equations that is 
extrapolated in time using a simple finite-difference scheme. 


7.3 Weak form of the elastic equation 


Despite the equivalence to the finite-element method, in order to keep the chapter 
independent we recall briefly the weak form as a starting point for the specific 
spectral-element discretization. 

We multiply both sides of Eq. (7.1) by a time-independent test function v(x). 
This may be any function of the set of functions that are, together with their first 
derivative, square integrable⁴ over the integration domain D (i.e. a continuous and
‘well-behaved’ function). D is here the complete computational domain defined
with $x \in D = [0, L]$.


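$$\int_D v\,\rho\,\partial_t^2 u\,dx - \int_D v\,\partial_x(\mu\,\partial_x u)\,dx = \int_D v\,f\,dx. \tag{7.4}$$

We integrate the second term of the left-hand side of the above equation by parts,
to obtain

$$\int_D v\,\rho\,\partial_t^2 u\,dx + \int_D \mu\,\partial_x v\,\partial_x u\,dx = \int_D v\,f\,dx, \tag{7.5}$$
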
where we made use of the boundary condition

$$\partial_x u(x,t)\big|_{x=0} = \partial_x u(x,t)\big|_{x=L} = 0. \tag{7.6}$$

Fig. 7.3 Principle of spectral element discretization. Bottom: Snapshot of the
displacement field u during a simulation in a strongly heterogeneous 1D medium.
Top: Close-up of the displacement field inside one element discretized with order
N = 4 collocation points, at which the solution is exactly interpolated using
Lagrange polynomials ($\ell_i$, grey lines). These points are also used for the
numerical integration scheme. The equation describes the interpolation scheme
using Lagrange polynomials.

4 If you take the integral of the square of the absolute values of a
square-integrable function, it is finite.

5 The use of the letter $u_i$ for the expansion coefficients might be surprising
here. $N_p$ is the number of required basis functions for a specific polynomial
order N. However, it will turn out that because of the choice of the basis
functions these coefficients will actually correspond to the discrete values of
the displacement field at grid points at element boundaries and/or inside the
elements.



Comparing equations (7.6) and (7.3) we note the equivalence of these conditions. 
This implies that the physical free-surface boundary condition, as in the classic 
finite-element approach, is implicitly fulfilled; an extremely attractive feature for 
3D problems involving the Earth’s surface! We are now left with the problem of 
finding solutions of the displacement field u for arbitrary space-dependent test 
functions v. 

At this point we are still in the continuous world in which solutions are sought 
by analytical means. As we seek to simulate wave propagation in Earth models 
with heterogeneous distributions of elastic parameters we need to find appropri- 
ate discrete representations of the seismic wavefield u with which we can find 
solutions by numerical means. This can be achieved by the Galerkin method. 
Here, we approximate the exact solution for $u(x,t)$ by a finite superposition of $N_p$
basis functions $\varphi_i(x)$ with $i = 1, \ldots, N_p$, weighted by time-dependent coefficients
$u_i(t)$.⁵ We expect the test functions and the solution to be continuous. Note that
we do not specify the basis functions here and we are still considering the complete
spatial domain $D$ (this will be changed later when we go down to the element
level). The approximate displacement field is denoted by $\bar{u}(x,t)$:

$$u(x,t) \approx \bar{u}(x,t) = \sum_{i=1}^{N_p} u_i(t)\,\varphi_i(x). \tag{7.7}$$


We expect that the accuracy of this approximation will depend on the specific
choice of basis function and the number of functions superimposed (i.e. $N_p$).
In the following we restrict ourselves to the problem of finding solutions to Eq.
(7.5) for our approximate displacement field $\bar{u}(x,t)$. In addition, we make another
important step by using as test functions the same functions that are used to
approximate our unknown fields (Galerkin principle), obtaining

$$\int_D \varphi_j\,\rho\,\partial_t^2 \bar{u}\,dx + \int_D \mu\,\partial_x\varphi_j\,\partial_x\bar{u}\,dx = \int_D \varphi_j\,f\,dx, \tag{7.8}$$


with the requirement that the medium is at rest at $t = 0$. Combining Eqs. 7.7
and 7.8 leads to an equation for the unknown coefficients $u_i(t)$

$$\sum_{i=1}^{N_p}\left[\partial_t^2 u_i(t)\int_D \rho(x)\,\varphi_j(x)\,\varphi_i(x)\,dx\right] + \sum_{i=1}^{N_p}\left[u_i(t)\int_D \mu(x)\,\partial_x\varphi_j(x)\,\partial_x\varphi_i(x)\,dx\right] = \int_D \varphi_j(x)\,f(x,t)\,dx \tag{7.9}$$

for all basis functions $\varphi_j$ with $j = 1, \ldots, N_p$. This is the well-known equation for
finite-element problems, which can be written in matrix notation:


$$\mathbf{M}\,\partial_t^2\mathbf{u}(t) + \mathbf{K}\,\mathbf{u}(t) = \mathbf{f}(t), \tag{7.10}$$


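with implicit matrix-vector operations. The mass matrix (here defined over the
entire domain) is

$$M_{ji} = \int_D \rho(x)\,\varphi_j(x)\,\varphi_i(x)\,dx, \tag{7.11}$$

the stiffness matrix

$$K_{ji} = \int_D \mu(x)\,\partial_x\varphi_j(x)\,\partial_x\varphi_i(x)\,dx, \tag{7.12}$$

and the vector containing the volumetric forces $f(x,t)$

$$f_j(t) = \int_D \varphi_j(x)\,f(x,t)\,dx. \tag{7.13}$$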


This simple matrix equation has to be solved for the space-independent but 
time-dependent coefficients u. The vector of coefficients will take the meaning of 
the actual displacement values at a global set of points imposed by the specific ba- 
sis functions to be introduced shortly (and must not be confused with the classic 
three-component displacement vector in the 3D elastic wave equation). This sys- 
tem of equations is illustrated graphically in Fig. 7.4, with a hint towards the final 
solution structure. The mass matrix is diagonal, thus its inversion is trivial. The 
stiffness matrix has a banded structure in this case with the bandwidth depending 
on the number of basis functions that are required inside each element. 

A simple centred finite-difference approximation of the second derivative in 
Eq. 7.10 and the following mapping 


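$$\begin{aligned} \mathbf{u}^{\mathrm{new}} &\to \mathbf{u}(t+dt)\\ \mathbf{u} &\to \mathbf{u}(t)\\ \mathbf{u}^{\mathrm{old}} &\to \mathbf{u}(t-dt) \end{aligned} \tag{7.14}$$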


leads us to the solution for the coefficient vector u(t + dt) for the next time step 
as already well known from the other solution schemes in previous chapters: 


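$$\mathbf{u}^{\mathrm{new}} = dt^2\left[\mathbf{M}^{-1}\left(\mathbf{f} - \mathbf{K}\,\mathbf{u}\right)\right] + 2\mathbf{u} - \mathbf{u}^{\mathrm{old}}. \tag{7.15}$$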
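The update of Eq. 7.15 can be sketched in a few lines of Python. This is a minimal illustration, not the routine from the supplementary material: the function name `time_step`, the storage of M as a list of its diagonal entries, and the single-oscillator test case are assumptions made here.

```python
import math

def time_step(u, u_old, m_diag, K, f, dt):
    """Return u(t + dt) from Eq. 7.15, with M stored as its diagonal."""
    n = len(u)
    u_new = []
    for j in range(n):
        Ku_j = sum(K[j][i] * u[i] for i in range(n))      # (K u)_j
        acc_j = (f[j] - Ku_j) / m_diag[j]                  # [M^-1 (f - K u)]_j
        u_new.append(dt**2 * acc_j + 2.0 * u[j] - u_old[j])
    return u_new

# Toy check (an assumption for illustration): one degree of freedom with
# M = 1, K = omega^2 and no force, i.e. a harmonic oscillator u(t) = cos(omega t).
omega, dt = 2.0 * math.pi, 1.0e-3
u, u_old = [1.0], [math.cos(omega * dt)]      # u(0) and u(-dt)
for _ in range(1000):                          # extrapolate to t = 1
    u, u_old = time_step(u, u_old, [1.0], [[omega**2]], [0.0], dt), u
print(abs(u[0] - math.cos(omega * 1.0)))       # small time-discretization error
```

Because M is diagonal, applying its inverse reduces to one division per coefficient, and no linear system has to be solved at any time step.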


It is important to note that so far nothing differs from the classic finite-element
approach. The problems that remain to be solved are finding appropriate basis
functions and integration schemes to efficiently calculate mass matrix, stiffness
matrix, and forces.

Fig. 7.4 Symbolic and mathematical representation of the global system that
has to be solved. The unknown acceleration $\partial_t^2 \mathbf{u}$ is found by a simple
temporal finite-difference approximation. The solution requires the inversion of
mass matrix M, which is trivial because it is diagonal. This is the key feature of
the spectral-element method.

Fig. 7.5 In order to facilitate the calculation of the space-dependent integrals
we transform each element onto the standard interval [-1, 1], illustrated here for
$n_e = 3$ elements. The elements share the boundary points.

Fig. 7.6 Illustration of local basis functions. By defining basis functions only
inside elements, the integrals can be evaluated in a local coordinate system. The
graph assumes three elements ($n_e = 3$) with equal size h = 2. A finite-element-type
linear-basis function (dashed line) is shown alongside a spectral-element-type
Lagrange polynomial-basis function of degree N = 5 (solid line). Compare with the
linear local-basis functions introduced in Chapter 6 on the finite-element method
(Fig. 6.4).


7.4 Getting down to the element level 


The solution of the elastic wave equation given in Eq. 7.9 is of a global nature,
that is, $x$ represents the complete physical domain. One level of discretization
came through the approximation of the unknown field $u$ by a finite sum over
some basis functions $\varphi_i$. However, a further level of discretization is required to
facilitate the final solution. We divide the domain $D$ into subdomains $D_e$ (which do
not need to be of the same size). This is illustrated in Fig. 7.5 for an example of a
1D spatial domain $D$ divided into $n_e = 3$ subdomains (the elements). This allows
the introduction of discontinuities in material parameters, leading to discontinuity
of the displacement gradient $\nabla u$. This step, without further specifying the basis
functions $\varphi_i$, leads to

$$\sum_{i=1}^{N_p}\left[\partial_t^2 u_i(t)\sum_{e=1}^{n_e}\int_{D_e}\rho(x)\,\varphi_j(x)\,\varphi_i(x)\,dx\right] + \sum_{i=1}^{N_p}\left[u_i(t)\sum_{e=1}^{n_e}\int_{D_e}\mu(x)\,\partial_x\varphi_j(x)\,\partial_x\varphi_i(x)\,dx\right] = \sum_{e=1}^{n_e}\int_{D_e}\varphi_j(x)\,f(x,t)\,dx, \tag{7.16}$$

representing a linear system of $N_p$ equations for each $j$.

It is important to note that the coefficients $u_i$ depend on a sum over all
elements. To avoid this global dependence we introduce (as was the case in
the classic finite-element method) basis functions that are only defined on the
subdomains $D_e$. This is illustrated in Fig. 7.6. Here a linear-basis function is
compared with a spectral-element-type non-linear-basis function. Mathematically this
locality has important beneficial consequences. Instead of defining basis functions
in $D$ (as is the case in pseudospectral methods) we now restrict them to reside
inside the elements $D_e$. We thus end up with the approximation

$$\bar{u}(x,t)\big|_{x\in D_e} = \sum_{i=1}^{N_p} u_i^e(t)\,\varphi_i^e(x), \tag{7.17}$$

where $N_p$ denotes the number of basis functions for polynomial order $N$ to be
summed up.⁶

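As a consequence, the integrals are now local to one specific element $D_e$ and
to obtain the solution inside we need only to sum over all basis functions with the
appropriate coefficients:

$$\sum_{i=1}^{N_p}\partial_t^2 u_i^e(t)\int_{D_e}\rho(x)\,\varphi_j^e(x)\,\varphi_i^e(x)\,dx + \sum_{i=1}^{N_p}u_i^e(t)\int_{D_e}\mu(x)\,\partial_x\varphi_j^e(x)\,\partial_x\varphi_i^e(x)\,dx = \int_{D_e}\varphi_j^e(x)\,f(x,t)\,dx. \tag{7.18}$$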


As was the case for the global system in Eq. 7.9, we can use matrix notation to
obtain

$$\mathbf{M}^e\,\partial_t^2\mathbf{u}^e(t) + \mathbf{K}^e\,\mathbf{u}^e(t) = \mathbf{f}^e(t), \qquad e = 1,\ldots,n_e. \tag{7.19}$$

Here $\mathbf{u}^e$, $\mathbf{K}^e$, $\mathbf{M}^e$, and $\mathbf{f}^e$ are (1) the coefficients of the unknown displacement inside
the element, (2) stiffness and (3) mass matrices with information on the density
and elastic parameters, and (4) the forces, respectively, and $n_e$ is the number of
elements. Matrix-vector multiplications are implicit.⁷

To facilitate the mathematical operations under the integrals it is useful to map
the spatial coordinates of an element to a reference interval. In principle this can
be any interval but for our specific choice of basis function we use the interval
[-1, 1] as indicated in Fig. 7.5. If we want to integrate an arbitrary function in
our reference interval [-1, 1] we have to apply a coordinate transformation from
our global system $x \in D$ to our local coordinates which we denote $\xi \in F_e$. This
transformation can be written as:

$$F_e : [-1,1] \to D_e, \quad x = F_e(\xi), \qquad \xi = \xi(x) = F_e^{-1}(x), \quad e = 1,\ldots,n_e, \tag{7.20}$$


where $n_e$ is the number of elements, and $\xi \in [-1, 1]$. Thus the physical coordinate
$x$ can be related to the local coordinate $\xi$ via

$$x(\xi) = F_e(\xi) = h_e\,\frac{\xi + 1}{2} + x_e, \tag{7.21}$$

where $x_e$ is the coordinate of the left side of the element (see Fig. 7.5) and $h_e$ is the
element size. The length $h_e$ may vary for each element, allowing the adaptation of
computational meshes. The inverse mapping is given as

$$\xi(x) = 2\,\frac{x - x_e}{h_e} - 1. \tag{7.22}$$

6 It turns out that for the specific basis functions we use order N will require
$N_p = N + 1$ functions to be summed.

7 The sizes of the elemental vectors and matrices in this system are:

    u^e  ->  N_p
    K^e  ->  N_p x N_p
    M^e  ->  N_p x N_p
    f^e  ->  N_p

where $N_p$ is the number of basis functions inside the elements.

8 German-born Carl Jacobi (1804-1851) was considered a mathematical wunderkind
and one of the greatest mathematicians of the nineteenth century; he is mostly
known for his contributions to the theory of elliptic functions.


From Eq. 7.18 we expect to have to solve integrals of products of basis functions,
their derivatives, and elastic parameters. A coordinate change $x \to \xi$ leads to

$$\int_{D_e} f(x)\,dx = \int_{-1}^{1} f^e(\xi)\,\frac{dx}{d\xi}\,d\xi, \tag{7.23}$$

where the integrand has to be multiplied by the Jacobian⁸ $J$ defined as

$$J = \frac{dx}{d\xi} = \frac{h_e}{2}. \tag{7.24}$$

The inverse Jacobian is also required, when derivatives of the basis functions need
to be integrated, thus

$$J^{-1} = \frac{d\xi}{dx} = \frac{2}{h_e}. \tag{7.25}$$
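The mapping of Eqs. 7.21 and 7.22 and the Jacobian of Eq. 7.24 translate directly into code. The sketch below is only an illustration; the function names and the element values (left edge at 2000 m, size 500 m) are assumptions chosen here.

```python
def to_physical(xi, x_e, h_e):
    """Eq. 7.21: map the local coordinate xi in [-1, 1] to x inside the element."""
    return h_e * (xi + 1.0) / 2.0 + x_e

def to_reference(x, x_e, h_e):
    """Eq. 7.22: map the physical coordinate x back to xi in [-1, 1]."""
    return 2.0 * (x - x_e) / h_e - 1.0

def jacobian(h_e):
    """Eq. 7.24: J = dx/dxi, constant inside a 1D element."""
    return h_e / 2.0

x_e, h_e = 2000.0, 500.0                   # assumed element: [2000 m, 2500 m]
left = to_physical(-1.0, x_e, h_e)         # left element boundary
right = to_physical(1.0, x_e, h_e)         # right element boundary
xi_back = to_reference(to_physical(0.3, x_e, h_e), x_e, h_e)
# Integrating f(x) = 1 over the element via Eq. 7.23: the reference interval
# has length 2, so the result 2 * J must equal the element size h_e.
length = 2.0 * jacobian(h_e)
```

Since J is constant within each 1D element, it can be precomputed once per element and reused in all elemental integrals.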
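Finally, we can assemble our system of equations inside each element as

$$\begin{aligned} &\sum_{i=1}^{N_p}\partial_t^2 u_i^e(t)\int_{-1}^{1}\rho[x(\xi)]\,\varphi_j^e[x(\xi)]\,\varphi_i^e[x(\xi)]\,\frac{dx}{d\xi}\,d\xi\\ &\quad+\sum_{i=1}^{N_p}u_i^e(t)\int_{-1}^{1}\mu[x(\xi)]\,\frac{d\varphi_j^e[x(\xi)]}{d\xi}\,\frac{d\varphi_i^e[x(\xi)]}{d\xi}\left(\frac{d\xi}{dx}\right)^2\frac{dx}{d\xi}\,d\xi\\ &\quad=\int_{-1}^{1}\varphi_j^e[x(\xi)]\,f[x(\xi),t]\,\frac{dx}{d\xi}\,d\xi. \end{aligned} \tag{7.26}$$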


Note that this is a system of $N_p$ equations for each index $j$ corresponding to one
particular basis function for which the wave equation needs to hold. Equation 7.26
is a semi-discrete weak form of the elastic wave equation for one element only, as we
have not yet discretized the time axis. What remains to be done is to find a choice
of basis functions $\varphi_i$ and a numerical integration scheme such that the calculation
of the integrals, and the assembly and solution of the final global system, become
as efficient and accurate as possible.


7.4.1 Interpolation with Lagrange polynomials 


It is about time we disclosed what basis functions we will be choosing to
approximate (i.e. ‘interpolate’) our unknown displacement field and why. Remember
we seek to approximate $u(x,t)$ by a sum over space-dependent basis functions $\varphi_i$
weighted by time-dependent coefficients $u_i(t)$:

$$u(x,t) \approx \bar{u}(x,t) = \sum_{i=1}^{N_p} u_i(t)\,\varphi_i(x). \tag{7.27}$$

As interpolating functions we finally choose the Lagrange polynomials⁹ and use
$\xi$ as the space variable representing our elemental domain:

$$\varphi_i \to \ell_i^{(N)}(\xi) := \prod_{k=1,\,k\neq i}^{N+1}\frac{\xi - \xi_k}{\xi_i - \xi_k}, \qquad i = 1, 2, \ldots, N+1, \tag{7.28}$$


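where $\xi_i$ are (in general arbitrary, separated) fixed points in the interval [-1, 1].
Let us look at this definition in more detail. Writing the product explicitly we obtain

$$\ell_i^{(N)}(\xi) = \frac{\xi-\xi_1}{\xi_i-\xi_1}\,\frac{\xi-\xi_2}{\xi_i-\xi_2}\cdots\frac{\xi-\xi_{i-1}}{\xi_i-\xi_{i-1}}\,\frac{\xi-\xi_{i+1}}{\xi_i-\xi_{i+1}}\cdots\frac{\xi-\xi_{N+1}}{\xi_i-\xi_{N+1}}, \tag{7.29}$$

which is well defined as long as $k \neq i$ since only separate points are allowed. We
further observe that for specific points $\xi_j$ with $j \neq i$

$$\ell_i^{(N)}(\xi_j) = \frac{\xi_j-\xi_1}{\xi_i-\xi_1}\cdots\frac{\xi_j-\xi_j}{\xi_i-\xi_j}\cdots\frac{\xi_j-\xi_{N+1}}{\xi_i-\xi_{N+1}} = 0 \tag{7.30}$$

and

$$\ell_i^{(N)}(\xi_i) = \prod_{k=1,\,k\neq i}^{N+1}\frac{\xi_i-\xi_k}{\xi_i-\xi_k} = 1. \tag{7.31}$$

We have just demonstrated the orthogonality of the Lagrange polynomials, which
can be expressed as

$$\ell_i^{(N)}(\xi_j) = \delta_{ij}, \tag{7.32}$$

where $\delta_{ij}$ is the Kronecker symbol, which is 1 if $i = j$ and 0 otherwise.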
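The product definition of Eq. 7.28 and the Kronecker property of Eq. 7.32 can be checked numerically. The following sketch uses 0-based indices and an arbitrary set of separated points; both the function name `lagrange` and the chosen nodes are assumptions made for illustration.

```python
def lagrange(i, xi, nodes):
    """Evaluate the Lagrange polynomial l_i at xi for the given nodes (0-based i)."""
    value = 1.0
    for k, xk in enumerate(nodes):
        if k != i:                                   # the factor k = i is skipped
            value *= (xi - xk) / (nodes[i] - xk)
    return value

nodes = [-1.0, -0.5, 0.3, 1.0]                       # arbitrary separated points
# Table of l_i(xi_j): should reproduce the identity matrix (Kronecker delta).
table = [[lagrange(i, xj, nodes) for xj in nodes] for i in range(len(nodes))]
# At any other point the polynomials sum to one (they interpolate f = 1 exactly).
unity = sum(lagrange(i, 0.123, nodes) for i in range(len(nodes)))
```

The same function works for any point set, so it can be reused unchanged once the collocation points are fixed.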

If you cannot picture the Lagrange polynomials by looking at the above equations
you are probably not alone. But before plotting them, we need to resolve
another issue. Which points $\xi_i$ in the interval [-1, 1] should we choose for our
simulation scheme? We use the so-called Gauss-Lobatto-Legendre¹⁰ (GLL) points
(see Fig. 7.7), a choice for which there are several reasons.

9 The first appearance of the spectral-element method was with Chebyshev
polynomials. However, this choice of basis function does not lead to the diagonal
mass matrix that makes inversion so efficient!

10 Did Gauss, Lobatto, and Legendre ever have a drink together? Probably
not, but Mozart and Haydn did get together regularly for jam sessions at around
that time in Vienna. Gauss (1777-1855) lived in Göttingen, Legendre (1752-1833)
in Paris, and Lobatto (1797-1866) in Amsterdam.

Fig. 7.7 Gauss-Lobatto-Legendre points. Illustration of their spatial distribution
in the interval [-1, 1] for polynomial order N = 2 to N = 12 (from bottom
to top), corresponding to $N_p = N + 1$ collocation points. The distribution of
points is symmetric around the origin. Note the decreasing distance between
collocation points towards the element boundaries (compare with Chebyshev
polynomials).

First of all, with this set of points the basis functions are defined such that
(the second condition holds at the interior collocation points, compare Eq. 7.48)

$$\ell_i^{(N)}(\xi_i) = 1 \quad\text{and}\quad \partial_\xi \ell_i^{(N)}(\xi_i) = 0. \tag{7.33}$$

This implies

$$\left|\ell_i^{(N)}(\xi)\right| \le 1, \qquad \xi \in [-1, 1], \tag{7.34}$$

minimizing the interpolation error in between the collocation points due to nu- 
merical inaccuracies. The densification of points towards the boundaries avoids 
overshooting of the interpolated function near the boundaries (similar to the 
Gibbs phenomenon discussed in Chapter 5 on the pseudospectral method). An- 
other important aspect is that an integration scheme exists (with the same name) 
that uses precisely this point set, leading to a diagonal mass matrix. 

Fig. 7.8 illustrates N + 1 Lagrange polynomials of degree N = 2 and 6. The 
order N is equivalent to the number of intervals inside each element. This implies 
that with N = 2 we would obtain a spectral-element discretization with even 
spacing of collocation points. Note also that with N = 1 we exactly recover the 
classic linear finite-element scheme! As the order increases, the difference between 
the distance of collocation points increases in a linear way (see Fig. 7.7). The 
interior GLL points are the roots of the first derivative of the Legendre polynomial $L_N$
of degree $N$ (definition below); the end points $\pm 1$ complete the set.
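As a hedged illustration of how the GLL points can be obtained in practice, the sketch below takes the end points -1 and 1 and adds the roots of the derivative of $L_N$, using NumPy's `numpy.polynomial.legendre` module. The function name `gll_points` is an assumption made here, not a routine from the supplementary material.

```python
import numpy as np
from numpy.polynomial import legendre

def gll_points(N):
    """GLL points for order N >= 1: the end points plus the roots of dL_N/dxi."""
    L_N = legendre.Legendre.basis(N)          # Legendre polynomial L_N
    interior = L_N.deriv().roots()            # N - 1 roots inside (-1, 1)
    return np.concatenate(([-1.0], np.sort(interior.real), [1.0]))

for N in (2, 3, 4):
    print(N, gll_points(N))
# The spacing of neighbouring points is uneven for N > 2 and shrinks
# towards the element boundaries.
spacing = np.diff(gll_points(6))
```

For N = 2, 3, and 4 the printed values can be compared with the collocation points listed in Table 7.1.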

Let us look at an example of function interpolation using Lagrange polynomi- 
als. In Fig 7.9 we approximate a known function (here a sum over sine functions) 
by Lagrange polynomials of various orders. According to 


$$u^e(\xi) = \sum_{i=1}^{N+1} u^e(\xi_i)\,\ell_i^{(N)}(\xi), \tag{7.35}$$


the function approximation is given as a sum over $N + 1$ polynomials weighted
with the values of the function at the collocation points $\xi_i$. The superscript $(N)$ of
the Lagrange polynomials is omitted from now on as the order is indicated by the
summation limits. This equation provides the reason why it makes sense to call
the coefficients $u_i$, as they correspond exactly to the continuous function $u$ at the
collocation points. Note the decreasing misfit between the approximation and the
original function for increasing order in Fig. 7.9 (see also exercises). The accuracy
obviously increases with order. However, it is important to note that this does not
mean the highest possible orders should be used for the final algorithm. A strategy
will be discussed when we assemble the complete spectral-element algorithm.
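The interpolation of Eq. 7.35 can be tried out with a short script. This is a sketch under stated assumptions: the test function sin(pi xi) is chosen here for simplicity (it is not the function shown in Fig. 7.9), and the helper names are hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre

def gll_points(N):
    """End points plus the roots of the derivative of L_N."""
    interior = legendre.Legendre.basis(N).deriv().roots()
    return np.concatenate(([-1.0], np.sort(interior.real), [1.0]))

def lagrange(i, xi, nodes):
    """Lagrange polynomial l_i (0-based i) evaluated at the array xi."""
    value = np.ones_like(xi)
    for k, xk in enumerate(nodes):
        if k != i:
            value = value * (xi - xk) / (nodes[i] - xk)
    return value

def interpolate(func, N, xi):
    """Eq. 7.35: sum of nodal values times Lagrange polynomials."""
    nodes = gll_points(N)
    return sum(func(nodes[i]) * lagrange(i, xi, nodes) for i in range(N + 1))

func = lambda x: np.sin(np.pi * x)             # assumed smooth test function
xi = np.linspace(-1.0, 1.0, 401)
errors = {N: np.max(np.abs(interpolate(func, N, xi) - func(xi)))
          for N in (2, 4, 8)}
print(errors)                                   # maximum misfit per order
```

For this smooth function the maximum misfit drops as the order increases, in line with the behaviour described for Fig. 7.9.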

With the definition of Eq. 7.28 we are now able to express the general finite- 
element system Eq. 7.26 with our choice of basis function to obtain 


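$$\begin{aligned} &\sum_{i=1}^{N+1}\partial_t^2 u_i^e(t)\int_{-1}^{1}\rho[x(\xi)]\,\ell_j(\xi)\,\ell_i(\xi)\,\frac{dx}{d\xi}\,d\xi\\ &\quad+\sum_{i=1}^{N+1}u_i^e(t)\int_{-1}^{1}\mu[x(\xi)]\,\partial_\xi\ell_j(\xi)\,\partial_\xi\ell_i(\xi)\left(\frac{d\xi}{dx}\right)^2\frac{dx}{d\xi}\,d\xi\\ &\quad=\int_{-1}^{1}\ell_j(\xi)\,f[x(\xi),t]\,\frac{dx}{d\xi}\,d\xi. \end{aligned} \tag{7.36}$$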


Again, it is worth pausing for a moment and clarifying the structure of this
system of equations. We have equations for each index $j = 1, \ldots, N+1$ while the
summation is over $i = 1, \ldots, N+1$. Everything is known (the analytical Lagrange
polynomials, the models of density and shear modulus); only the displacement
and acceleration values $u_i$ and $\partial_t^2 u_i$ are unknown.

To simplify notation we use the following mapping for density $\rho$, elastic
constant $\mu$, and forces $f$:

$$\rho(\xi) := \rho[x(\xi)], \qquad \mu(\xi) := \mu[x(\xi)], \qquad f(\xi, t) := f[x(\xi), t]. \tag{7.37}$$


In practice that means we initiate the density and shear moduli at each colloca- 
tion point. Note that assuming constant density and shear moduli inside elements 
would simplify the problem here, since we could take them out of the integrals 
and solve the integrals analytically. However, the fact that we can allow the geo- 
physical parameters to vary smoothly inside elements is an attractive feature for 
seismic wave-propagation problems. As a consequence, the integrals in Eq. 7.36 
unfortunately cannot be evaluated analytically, because we want to keep this flex- 
ibility for strongly heterogeneous models. Therefore, we are forced to resort to 
numerical integration, the topic of the next section. 


7.4.2 Numerical integration 


As illustrated above we have to find an efficient way of solving the integrals over
the element domain $D_e$ given in Eq. 7.36. Like finding the derivatives of a function
numerically, the approximation of integrals is a vast field of its own. A fundamental
principle that is often applied is the concept of replacing the function to be
integrated $f(x)$ by a polynomial approximation that can be integrated analytically.
A well-known example is the classic Gauss quadrature. It can be shown that a
degree $2N + 1$ polynomial can be integrated exactly with only $N + 1$ collocation
points. The problem is that the corresponding collocation points lie inside the
element (and not at the boundaries). The requirement that the boundaries of the
integrals are included leads to the so-called Gauss-Lobatto-Legendre quadrature
(well, we heard these three names before) using the GLL points for integration.
Fixing the two end points reduces the exactness to polynomials of degree $2N - 1$.

Fig. 7.8 Lagrange Polynomials. Top: Family of N + 1 Lagrange polynomials
for N = 2 defined in the interval $\xi \in [-1, 1]$. Note their maximum value
over the whole interval does not exceed unity. Bottom: Same for N = 6.
The domain is divided into N intervals of uneven length. When using Lagrange
polynomials for function interpolation the values are exactly recovered at
the Gauss-Lobatto-Legendre collocation points (squares).

Fig. 7.9 Interpolation with Lagrange Polynomials. The function to be
approximated is given by the solid lines. The approximation is given by the dashed
line exactly interpolating the function at the GLL points (squares). Top: Order
N = 2 with three grid points. Bottom: Order N = 6 with seven grid points.

As interpolating functions we use again the Lagrange polynomials and obtain 
the following integration scheme for an arbitrary function f(x) defined in the 
interval x € [-1, 1]: 


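$$\int_{-1}^{1} f(x)\,dx \approx \int_{-1}^{1} P_N(x)\,dx = \sum_{i=1}^{N+1} w_i\,f(x_i), \tag{7.38}$$

with

$$P_N(x) = \sum_{i=1}^{N+1} f(x_i)\,\ell_i^{(N)}(x), \tag{7.39}$$

and the integration weights are calculated with

$$w_i = \int_{-1}^{1} \ell_i^{(N)}(x)\,dx. \tag{7.40}$$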


Examples of these integration weights are given in Table 7.1. Note the decreasing
values towards the element boundaries compensating for the narrowing
intervals. For the numerical integration the same principle applies as was the case 
for the interpolation scheme. The higher the order, the more accurate the integra- 
tion is (see exercises). Another point is the specific form of the integrand f(x). In 
principle, the smoother this function is, the more accurate the approximation; that 
is, the faster the numerical result converges to the exact solution as the polynomial 
order increases. 

Let us take an example illustrating how numerical integration works. We 
initialize an arbitrary function, here a sum of sinusoidal functions 


$$f(x) = \sum_{i=1}^{5} \sin\!\left(\frac{\pi}{a_i}\,x + a_i\right) \tag{7.41}$$


with a = [0.5, 1,-3,-2,-5, 4], which can be easily integrated analytically. Using 
the GLL weights from Table 7.1 we integrate this function numerically for vary- 
ing order N and compare with the analytical solution. Examples are shown in 
Fig. 7.10. This is a fairly tough problem for the interpolator, but a decent result is 
obtained with integration order N > 6. It is remarkable how different the approxi- 
mation is compared to the original function, yet the numerical integral calculation 
is very accurate. 
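The quadrature rule of Eq. 7.38 is easy to try out with the points and weights of Table 7.1. The sketch below is an illustration only: the dictionary `GLL`, the function name, and the two test integrands (x^4 and exp(x), chosen here because their integrals are known in closed form) are assumptions.

```python
import math

# Collocation points and weights from Table 7.1, ordered from -1 to 1.
GLL = {
    2: ([-1.0, 0.0, 1.0], [1/3, 4/3, 1/3]),
    3: ([-1.0, -math.sqrt(1/5), math.sqrt(1/5), 1.0], [1/6, 5/6, 5/6, 1/6]),
    4: ([-1.0, -math.sqrt(3/7), 0.0, math.sqrt(3/7), 1.0],
        [1/10, 49/90, 32/45, 49/90, 1/10]),
}

def gll_integrate(func, N):
    """Eq. 7.38: weighted sum of function values at the GLL points."""
    points, weights = GLL[N]
    return sum(w * func(x) for x, w in zip(points, weights))

# x^4 on [-1, 1] has the exact integral 2/5; order N = 2 misses it,
# order N = 3 reproduces it to rounding error.
quartic = {N: gll_integrate(lambda x: x**4, N) for N in (2, 3)}

# exp(x) is not a polynomial; the error shrinks quickly with the order.
exact = math.exp(1.0) - math.exp(-1.0)
errors = {N: abs(gll_integrate(math.exp, N) - exact) for N in (2, 3, 4)}
print(quartic, errors)
```

Note that the rule for N = 2 is the familiar Simpson rule on the interval [-1, 1].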


At this point we can draw an important conclusion: Within the realm of the 
polynomial representation of our displacement field, the density, elastic constants, 
and forces, the only error we are accumulating in the spatial domain is by the 
numerical integration scheme just discussed. The only other error in the complete 
spectral-element algorithm comes from the finite-difference approximation of the 
time derivatives. The use of the GLL integration scheme given in Eq. 7.38 to 
evaluate the integrals in Eq. 7.36 has important consequences. We now introduce 
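another level of discretization by replacing the continuous integration over the
elements by a sum over (again) $N + 1$ weighted functional values located at our
well-known GLL points, the same locations at which we interpolate our unknown
function $u$. With this integration scheme leading to an additional sum over $k$, we
obtain at element level

$$\begin{aligned} &\sum_{i,k=1}^{N+1}\partial_t^2 u_i^e(t)\,w_k\,\rho(\xi)\,\ell_j(\xi)\,\ell_i(\xi)\,\frac{dx}{d\xi}\bigg|_{\xi=\xi_k}\\ &\quad+\sum_{i,k=1}^{N+1} w_k\,u_i^e(t)\,\mu(\xi)\,\partial_\xi\ell_j(\xi)\,\partial_\xi\ell_i(\xi)\left(\frac{d\xi}{dx}\right)^2\frac{dx}{d\xi}\bigg|_{\xi=\xi_k}\\ &\quad\approx\sum_{k=1}^{N+1} w_k\,\ell_j(\xi)\,f(\xi,t)\,\frac{dx}{d\xi}\bigg|_{\xi=\xi_k}. \end{aligned} \tag{7.42}$$
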


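We can make use of the cardinal interpolation property of the Lagrange
polynomials, $\ell_i(\xi_j) = \delta_{ij}$, to arrive at the solution equation for our spectral-element
system at the element level using matrix notation:

$$\sum_{i=1}^{N+1} M_{ji}^e\,\partial_t^2 u_i^e(t) + \sum_{i=1}^{N+1} K_{ji}^e\,u_i^e(t) = f_j^e(t), \qquad e = 1,\ldots,n_e, \tag{7.43}$$

with

$$\begin{aligned} M_{ji}^e &= w_j\,\rho(\xi)\,\frac{dx}{d\xi}\,\delta_{ij}\bigg|_{\xi=\xi_j}\\ K_{ji}^e &= \sum_{k=1}^{N+1} w_k\,\mu(\xi)\,\partial_\xi\ell_j(\xi)\,\partial_\xi\ell_i(\xi)\left(\frac{d\xi}{dx}\right)^2\frac{dx}{d\xi}\bigg|_{\xi=\xi_k}\\ f_j^e &= w_j\,f(\xi,t)\,\frac{dx}{d\xi}\bigg|_{\xi=\xi_j}. \end{aligned} \tag{7.44}$$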


To some extent the above three equations for mass matrix, stiffness matrix,
and sources constitute the core of the preparatory computations that need to
be done before starting the time extrapolation for the wavefield.¹¹ But wait a
minute! There is one item that needs to be considered further. Note that to
calculate the stiffness matrix we need to know the derivatives of the Lagrange
polynomials evaluated at the GLL collocation points. As our intention is to
provide you with all equations necessary to solve (at least) the 1D problem using
spectral elements, we will deviate briefly to present how these derivatives are
calculated.

Table 7.1 Collocation points and integration weights of the GLL quadrature
for order N = 2, ..., 4.

    N    xi_i           w_i
    2    0              4/3
         ±1             1/3
    3    ±sqrt(1/5)     5/6
         ±1             1/6
    4    0              32/45
         ±sqrt(3/7)     49/90
         ±1             1/10

11 Note the $\delta_{ij}$ in the equation for the mass matrix $M_{ji}$, implying a diagonal
structure. As a consequence, finding its inverse, required for the solution of the
global system of equations, is trivial.

Fig. 7.10 Gauss integration. The exact function (thick solid line) is
approximated by Lagrange polynomials (thin solid line) that can be integrated
analytically. Thus, the integral of the true function (thick solid) is replaced by an
integral over the polynomial function (dark grey). The difference between the true
and approximate functions is given in light grey. Top: N = 3. Bottom: N = 6.
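To make the diagonal mass matrix of Eq. 7.44 concrete, the following sketch computes its entries for one element of order N = 4 with the weights of Table 7.1. The density, the element size, and the function name are assumptions chosen for illustration.

```python
def element_mass(weights, rho, h_e):
    """Diagonal of M^e (Eq. 7.44): M_jj = w_j * rho(xi_j) * dx/dxi."""
    J = h_e / 2.0                                  # Jacobian of Eq. 7.24
    return [w * r * J for w, r in zip(weights, rho)]

weights = [1/10, 49/90, 32/45, 49/90, 1/10]        # N = 4, Table 7.1
rho = [2500.0] * 5                                 # assumed constant density (kg/m^3)
h_e = 200.0                                        # assumed element size (m)

M_e = element_mass(weights, rho, h_e)
M_inv = [1.0 / m for m in M_e]                     # the inverse is element-wise
total = sum(M_e)                                   # equals rho * h_e for constant rho
```

Only a vector of N + 1 numbers is stored per element, and the inverse needed in the time extrapolation is obtained by element-wise division.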


7.4.3 Derivatives of Lagrange polynomials 


From Eq. 7.26 it is clear that we also need the derivatives of our basis functions,
that is, each Lagrange polynomial, as they are part of the integrands. Thus we seek
(if possible) analytical solutions to $\partial_\xi \ell_i(\xi)$. Here we present a common scheme
for its calculation using a recursive formula.

The GLL integration scheme implies that we do not need to know the deriva- 
tives everywhere in [—1, 1] but only at the GLL collocation points. It turns out that 
these derivatives can be efficiently calculated using Legendre polynomials. They 
are also defined in the interval $\xi \in [-1, 1]$ and are given as:

$$L_N(\xi) = \frac{1}{2^N N!}\,\frac{d^N}{d\xi^N}\left(\xi^2 - 1\right)^N, \tag{7.45}$$


where N denotes the polynomial degree. The Legendre polynomials can be 
calculated using the following recursive formula: 


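$$\begin{aligned} L_0(\xi) &= 1\\ L_1(\xi) &= \xi\\ L_{n\ge 2}(\xi) &= \frac{1}{n}\left[(2n-1)\,\xi\,L_{n-1}(\xi) - (n-1)\,L_{n-2}(\xi)\right]. \end{aligned} \tag{7.46}$$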
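The recursion of Eq. 7.46 can be written as a short loop. This is a minimal sketch; the function name `legendre_poly` is an assumption made here.

```python
def legendre_poly(N, xi):
    """Evaluate the Legendre polynomial L_N(xi) with the recursion of Eq. 7.46."""
    if N == 0:
        return 1.0
    L_prev, L_curr = 1.0, xi                       # L_0 and L_1
    for n in range(2, N + 1):
        L_next = ((2 * n - 1) * xi * L_curr - (n - 1) * L_prev) / n
        L_prev, L_curr = L_curr, L_next
    return L_curr

print(legendre_poly(2, 0.5))    # L_2(x) = (3x^2 - 1)/2
print(legendre_poly(3, 0.5))    # L_3(x) = (5x^3 - 3x)/2
print(legendre_poly(6, 1.0))    # every Legendre polynomial equals 1 at xi = 1
```

Only the two previous polynomials are kept in memory, so the cost of evaluating $L_N$ at one point grows linearly with N.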


An illustration of the Legendre polynomials is given in Fig. 7.11. Following
Funaro (1993), the derivatives of the Lagrange polynomials can be calculated
using (with the collocation points indexed from 0 to N):

$$\partial_\xi \ell_k(\xi_i) = \sum_{j=0}^{N} d_{ij}^{(1)}\,\ell_k(\xi_j), \qquad k = 0,\ldots,N, \tag{7.47}$$

with

$$d_{ij}^{(1)} = \begin{cases} -\tfrac{1}{4}N(N+1) & \text{if } i = j = 0\\[4pt] \dfrac{L_N(\xi_i)}{L_N(\xi_j)}\,\dfrac{1}{\xi_i-\xi_j} & \text{if } 0 \le i \le N,\ 0 \le j \le N,\ i \ne j\\[4pt] 0 & \text{if } 1 \le i = j \le N-1\\[4pt] \tfrac{1}{4}N(N+1) & \text{if } i = j = N. \end{cases} \tag{7.48}$$

For a spectral-element simulation of a specific order $N$, a matrix with the
derivatives $\partial_\xi \ell_k(\xi_i)$ for each polynomial $k$ at all $N+1$ collocation points $\xi_i$ is precalculated
and used to evaluate the integrals (see Matlab/Python routines given in the
supplementary material). Note that these precalculated derivatives of Lagrange
polynomials can be used to approximate the derivative of an arbitrary function
$u(\xi)$ defined in $\xi \in [-1, 1]$ at the GLL collocation points $\xi_i$:


$$\partial_\xi u^e(\xi) = \sum_{i=1}^{N+1} u^e(\xi_i)\,\partial_\xi \ell_i(\xi). \tag{7.49}$$
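A possible way to precompute the derivatives of the Lagrange polynomials at the GLL points is sketched below. It uses the standard closed-form GLL differentiation matrix based on the Legendre polynomial $L_N$ and NumPy's Legendre class; the function names and the storage convention `D[i, j]` = derivative of polynomial j at point i (0-based) are assumptions made here, not the routine from the supplementary material.

```python
import numpy as np
from numpy.polynomial import legendre

def gll_points(N):
    """End points plus the roots of the derivative of L_N."""
    interior = legendre.Legendre.basis(N).deriv().roots()
    return np.concatenate(([-1.0], np.sort(interior.real), [1.0]))

def derivative_matrix(N):
    """Return GLL points and D with D[i, j] = d l_j / d xi at xi_i."""
    xi = gll_points(N)
    L_N = legendre.Legendre.basis(N)(xi)           # L_N at the GLL points
    D = np.zeros((N + 1, N + 1))
    for i in range(N + 1):
        for j in range(N + 1):
            if i != j:
                D[i, j] = L_N[i] / L_N[j] / (xi[i] - xi[j])
    D[0, 0] = -N * (N + 1) / 4.0                   # corner entries
    D[N, N] = N * (N + 1) / 4.0                    # interior diagonal stays zero
    return xi, D

xi, D = derivative_matrix(4)
du = D @ xi**3                                     # derivative of u = xi^3 (Eq. 7.49)
print(np.max(np.abs(du - 3.0 * xi**2)))            # close to rounding error
```

Because $\xi^3$ is a polynomial of degree below N + 1 = 5, its derivative is reproduced at the collocation points up to rounding error.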


At this point we have almost finished! At least, we have prepared everything 
that we need to express our approximate solutions inside one element. What 
remains to be done is to link the elemental results and formulate the complete 
solution over the entire physical domain. This process is called assembly. 


7.5 Global assembly and solution 


Eqs. 7.44 provided us with solutions of the wave equation inside an element
without interaction with the outside. To understand how to assemble the global
solution it is instructive to recall the original global solution we obtained (Eq. 7.9).
We have done nothing other than to divide up the entire domain $D$ into $n_e$
elements and describe the solutions at an elemental level. As indicated earlier the
classic finite- (spectral-) element method assumes continuity of the solution fields
at the element boundaries. Therefore, we simply need to add up the elemental
solutions at the corresponding boundary collocation points. In the spectral-element
method, each element boundary thus only has one value.¹² Before we present the
calculation of the global system of equations, let us discuss the dimensions of the
resulting vectors and matrices, that is, the overall number of degrees of freedom in
our spectral-element system.

Comparing with Fig. 7.5 it is straightforward to see that, for a system with $n_e$
elements and given polynomial order $N$, the global number of collocation points
$n_g$ of our system is $n_g = n_e \times N + 1$. This is illustrated in Fig. 7.12 for a physical
domain of size 10 km and elements of equal size.
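The count $n_g = n_e \times N + 1$ can be verified by building the global point set explicitly. The sketch below uses the N = 4 points of Table 7.1 and the setup quoted for Fig. 7.12 (four equal elements on a 10 km domain); the function name is an assumption.

```python
import math

def global_points(n_e, length, xi):
    """Global collocation points for n_e equal elements on [0, length].

    xi holds the N + 1 GLL points of one element in [-1, 1]."""
    h_e = length / n_e
    x = []
    for e in range(n_e):
        x_e = e * h_e                              # left edge of element e
        first = 0 if e == 0 else 1                 # shared boundary point counted once
        for xi_k in xi[first:]:
            x.append(x_e + h_e * (xi_k + 1.0) / 2.0)   # mapping of Eq. 7.21
    return x

xi = [-1.0, -math.sqrt(3/7), 0.0, math.sqrt(3/7), 1.0]   # N = 4, Table 7.1
x = global_points(n_e=4, length=10000.0, xi=xi)
print(len(x))                                      # n_e * N + 1 points
```

Each interior element boundary appears only once in the list, which is why the total is $n_e \times N + 1$ rather than $n_e \times (N + 1)$.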

We have derived a diagonal elemental mass matrix, which implies that we can 
store its entries as a vector. Nevertheless we present its global shape for illustra- 
tive purposes in Fig. 7.13, highlighting two of the elemental matrices inside the 
domain that make up the global system. Note that the elemental matrices in gen- 
eral differ from each other as they depend on elastic parameters and density. In 
addition, their Jacobians depend on element size. 

We denote the global matrices (vectors) with subscript $g$ and illustrate their
mathematical form for a system with $n_e = 3$ elements and order $N = 2$ Lagrange
polynomials. As the mass matrix is diagonal we show it in vector form and
obtain

$$\mathbf{M}_g = \begin{pmatrix} M_{11}^{(1)}\\ M_{22}^{(1)}\\ M_{33}^{(1)} + M_{11}^{(2)}\\ M_{22}^{(2)}\\ M_{33}^{(2)} + M_{11}^{(3)}\\ M_{22}^{(3)}\\ M_{33}^{(3)} \end{pmatrix}, \tag{7.50}$$

where the corner entries of the elemental matrices are summed (here every third
element). The upper indices in brackets denote the elements. Note that we have
$N + 1 = 3$ collocation points inside each element (see Fig. 7.13).

Fig. 7.11 Illustration of the Legendre polynomials up to order N = 6. The
Legendre polynomials are used to calculate the first derivatives of the Lagrange
polynomials. They can also be used to calculate the integration weights of the
GLL quadrature.

12 This is the main difference with the discontinuous Galerkin method, where
every element has its own boundary values and information between elements is
transmitted by flux terms.

Fig. 7.12 Global GLL collocation points for $n_e = 4$ elements, order N = 4
polynomials, and a physical domain with x ∈ [0; 10,000] m.

Fig. 7.13 Global mass matrix. The diagonal structure of the mass matrix
is illustrated for a 1D example with $n_e = 4$ elements and order N = 4 Lagrange
polynomials. Elemental matrices are overlapping and summed at the
global mass matrix entries representing the element boundaries (compare with
Eq. 7.50).

Fig. 7.14 Global stiffness matrix. The banded structure of the stiffness matrix
is illustrated for a 1D example with $n_e = 4$ elements and order N = 4
Lagrange polynomials. Elemental matrices are overlapping and summed at
the global stiffness matrix entries representing the element boundaries (compare
with Eq. 7.51).

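For the global stiffness matrix we obtain in an analogous way

$$\mathbf{K}_g = \begin{pmatrix} K_{11}^{(1)} & K_{12}^{(1)} & K_{13}^{(1)} & & & & \\ K_{21}^{(1)} & K_{22}^{(1)} & K_{23}^{(1)} & & & 0 & \\ K_{31}^{(1)} & K_{32}^{(1)} & K_{33}^{(1)} + K_{11}^{(2)} & K_{12}^{(2)} & K_{13}^{(2)} & & \\ & & K_{21}^{(2)} & K_{22}^{(2)} & K_{23}^{(2)} & & \\ & & K_{31}^{(2)} & K_{32}^{(2)} & K_{33}^{(2)} + K_{11}^{(3)} & K_{12}^{(3)} & K_{13}^{(3)} \\ & 0 & & & K_{21}^{(3)} & K_{22}^{(3)} & K_{23}^{(3)} \\ & & & & K_{31}^{(3)} & K_{32}^{(3)} & K_{33}^{(3)} \end{pmatrix} \tag{7.51}$$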


with each element represented by an (N+ 1) x (N+ 1) matrix. Inside the domain 
the first and last diagonal element are summed up with the value from the adjacent 
element. For a 1D grid, and also with varying element size, the stiffness matrix 
has a banded structure (Fig. 7.14). 
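The assembly described above can be sketched for the same small case ($n_e = 3$, $N = 2$). The code is an illustration under stated assumptions: the densities, shear moduli, and element sizes are made up, the N = 2 derivative matrix is hard-coded with the convention `D[k, j]` = derivative of polynomial j at point k, and the function names are hypothetical.

```python
import numpy as np

N, n_e = 2, 3
w = np.array([1/3, 4/3, 1/3])                      # GLL weights, Table 7.1
D = np.array([[-1.5, 2.0, -0.5],                   # D[k, j] = dl_j/dxi at xi_k
              [-0.5, 0.0, 0.5],
              [0.5, -2.0, 1.5]])

def element_matrices(rho, mu, h_e):
    """Elemental mass (diagonal) and stiffness matrix following Eq. 7.44."""
    J = h_e / 2.0
    M_e = w * rho * J
    # K_e[j, i] = sum_k w_k mu D[k, j] D[k, i] (dxi/dx)^2 dx/dxi
    K_e = (D.T * (w * mu)) @ D / J
    return M_e, K_e

def assemble(rho, mu, h):
    """Sum elemental contributions; neighbours overlap at shared points."""
    n_g = n_e * N + 1
    M_g = np.zeros(n_g)                            # diagonal stored as a vector
    K_g = np.zeros((n_g, n_g))
    for e in range(n_e):
        M_e, K_e = element_matrices(rho[e], mu[e], h[e])
        i0 = e * N                                 # global index of first local point
        M_g[i0:i0 + N + 1] += M_e
        K_g[i0:i0 + N + 1, i0:i0 + N + 1] += K_e
    return M_g, K_g

rho = [2000.0, 2500.0, 3000.0]                     # assumed per-element density
mu = [1.0e10, 2.0e10, 3.0e10]                      # assumed per-element shear modulus
h = [100.0, 200.0, 150.0]                          # assumed element sizes
M_g, K_g = assemble(rho, mu, h)
```

Using the index offset `e * N` makes the last point of one element coincide with the first point of the next, which produces the summed corner entries of Eqs. 7.50 and 7.51.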

Note that for general irregular grids (hexahedral or tetrahedral) a connectivity 
has to be defined and the stiffness matrix structure can be arbitrarily filled. An 
example of a stiffness matrix for an irregular grid is shown in Fig. 7.15. 

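Equivalently, the vector containing information on the source is given as

$$\mathbf{f}_g = \begin{pmatrix} f_1^{(1)}\\ f_2^{(1)}\\ f_3^{(1)} + f_1^{(2)}\\ f_2^{(2)}\\ f_3^{(2)} + f_1^{(3)}\\ f_2^{(3)}\\ f_3^{(3)} \end{pmatrix}. \tag{7.52}$$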


We end up with a system of equations for $n_g = n_e \times N + 1$ coefficients for the
displacement $\mathbf{u}_g$, where $N$ is the interpolation order and $n_e$ is the number of
elements. As illustrated above, the matrices $\mathbf{M}_g$ and $\mathbf{K}_g$ have dimensions $n_g \times n_g$.
Because of its diagonal structure, $\mathbf{M}_g$ is obviously never initialized as a matrix. To
save memory it is stored as a vector of the diagonal elements. The force vector $\mathbf{f}_g$
also has $n_g$ elements. The time-dependent coefficients $\mathbf{u}_g$ are extrapolated with a
simple centred finite-difference scheme to the next time step $t + dt$:

$$\mathbf{u}_g(t + dt) = dt^2\left[\mathbf{M}_g^{-1}\left(\mathbf{f}_g(t) - \mathbf{K}_g\,\mathbf{u}_g(t)\right)\right] + 2\,\mathbf{u}_g(t) - \mathbf{u}_g(t - dt). \tag{7.53}$$


In this final algorithm only the coefficients u_g are updated as a response to the
time-dependent forces that are injected through f_g at predefined locations after
the system was at rest at t = 0 (our initial condition). Even though in principle
mass and stiffness matrices could be modified during the time extrapolation (e.g.
when the elastic parameters or density are time-dependent or the computational
mesh is adapted), this case will not be considered here. In many seismological
applications the Earth model and the mesh remain constant.
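
Because the global mass matrix is diagonal, applying its inverse in Eq. 7.53 amounts to an element-wise division. The following is a minimal sketch of a single extrapolation step under that assumption. The function name `sem_time_step`, the vector name `M_diag`, and the small 3-point toy system are illustrative choices and not taken from the text.

```python
import numpy as np

def sem_time_step(u, u_old, f, K, M_diag, dt):
    """One step of Eq. 7.53 with the diagonal mass matrix stored as a vector."""
    accel = (f - K @ u) / M_diag          # M^-1 (f - K u) as element-wise division
    return dt**2 * accel + 2.0 * u - u_old

# Toy system (assumed values, not an assembled spectral-element matrix)
K = np.array([[ 1.0, -1.0,  0.0],
              [-1.0,  2.0, -1.0],
              [ 0.0, -1.0,  1.0]])
M_diag = np.array([0.5, 1.0, 0.5])
f = np.array([0.0, 1.0, 0.0])             # force acting at the middle point
u = np.zeros(3)                           # system at rest at t = 0
u_old = np.zeros(3)
dt = 0.1

u_new = sem_time_step(u, u_old, f, K, M_diag, dt)
```

Storing only the diagonal keeps the memory cost at n_g values and avoids any call to a linear solver.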

Before turning this algorithm into a computer program let us throw some light 
on the forces that are injected. 


7.6 Source input 


In many research applications in seismology it is sufficient to treat the seismic 
source as acting at a single point. In addition, any finite source representation can 
be obtained by summing over many point sources according to the superposition 
principle. What happens if we activate a force at a single collocation point inside 
(or at the edge of) an element? This is illustrated in Fig. 7.16. As is the case in 
the finite-difference method, injecting a source at a single collocation point is not 
a problem. In fact, due to the Galerkin approach the integral correctly represents 
a delta-function. 

If the source is not located directly at a collocation point, a possible solution 
is to use (smooth) spatially limited functions to spread the point source to adja- 
cent grid points. Note that, depending on its spatial extension, this may lead to a 
low-pass filtering of the injected source time function. The injection of sources 
is described in more detail in Fichtner (2010) for both spectral-element and 
finite-difference methods. In the vicinity of the point sources the solution may be 
erroneous as the near-field terms are not properly represented. This is discussed 
in Nissen-Meyer et al. (2007). However, the authors note that the wavefields are 
accurate when records are taken more than two elements away from the physical 
source location. 
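
The spreading of a point source that does not coincide with a collocation point can be sketched as follows. This is only an illustration: the Gaussian width `sigma`, the normalization to unit sum, and the regular point spacing are assumptions made here, and the function name `spread_point_source` is hypothetical.

```python
import numpy as np

def spread_point_source(x, x_src, sigma):
    """Gaussian weights that distribute a unit point source over the points x."""
    g = np.exp(-((x - x_src) ** 2) / (2.0 * sigma**2))
    return g / g.sum()                    # normalize so the weights sum to one

x = np.linspace(0.0, 100.0, 11)           # illustrative points, 10 m apart
s = spread_point_source(x, x_src=43.0, sigma=10.0)
```

A wider `sigma` distributes the source over more points and therefore acts as a stronger spatial low-pass filter, which is the effect mentioned above.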


7.7 The spectral-element method in action 


7.7.1. Homogeneous example 


We can now proceed to implement Eq. 7.53 in a computer program and illus- 
trate some of the spectral-element-specific features with code fragments written in 
Python. Compared to the finite-difference method the preparatory steps for the 
spectral-element extrapolation are substantially more involved, even though the 
final extrapolation is very similar. Thus we illustrate the workflow schematically
in Fig. 7.17 (which, with a few alterations, is also representative of the nodal
discontinuous Galerkin method).

Fig. 7.15 Graphical illustration of a sparse global matrix for the case of an
irregular mesh. The non-zero entries are shown in black.

Fig. 7.16 Illustration of the seismic source input into a spectral-element
algorithm. Top: A point source representation at a collocation point (black bar).
Bottom: Sources not collocated with collocation points (open square) can be
input by spreading them with appropriately scaled spatially limited functions
(here, a Gaussian).

13 It is worth noting (again) that this particular result is one of the keys to the
success of the spectral-element method.

For any finite-element-type method, the calculation of the system matrices 
(stiffness and mass) constitutes the most important preparatory step before the 
time-loop is started to extrapolate the initial conditions. The following code 
snippet shows that the mathematical developments described above lead to an 
extremely dense algorithm for the calculation of the elemental stiffness and mass 
matrices: 


from numpy import zeros

# Elemental mass matrix,
# stored as a vector since it's diagonal
Me = zeros(N+1)
for i in range(0, N+1):
    Me[i] = rho * w[i] * J    # density * GLL weight * Jacobian
# [...]
# Elemental stiffness matrix
Ke = zeros([N+1, N+1])
for i in range(0, N+1):
    for j in range(0, N+1):
        for k in range(0, N+1):
            # l1d[i,k]: derivative of Lagrange polynomial i at GLL point k
            Ke[i,j] = Ke[i,j] + mu * w[k] * Ji * l1d[i,k] * l1d[j,k]


Here the elemental mass matrix Me is initialized in vector form because of its
diagonal structure. We integrate density multiplied by the product of our basis
functions (i.e. unity). Because of their orthogonality the resulting δ-function leads
to this extremely simple formulation.¹³ In this case the parameters μ and ρ are
constant inside the element.

The integration weights w are initialized from precalculated tables (see
Table 7.1). In 1D the Jacobian J (and its inverse J⁻¹) is a scalar and in this
example kept constant (i.e. all elements have the same size). This can easily be
changed to allow space-dependent element size, one of the most attractive features
of element-based techniques.
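
As an alternative to tabulated values, the GLL points and weights can be computed numerically. The sketch below assumes the standard definition (interior points are the roots of the derivative of the Legendre polynomial of order N, weights are 2 / (N (N + 1) P_N(ξ)²)) and uses NumPy's `Legendre` class. The function name `gll` and the choice of ne = 4, x ∈ [0, 10,000] m, and ρ = 2,000 kg/m³ are illustrative.

```python
import numpy as np
from numpy.polynomial import Legendre

def gll(N):
    """GLL points and weights on [-1, 1] for polynomial order N >= 1."""
    PN = Legendre.basis(N)
    interior = PN.deriv().roots()                      # roots of P_N'
    xi = np.concatenate(([-1.0], np.sort(interior.real), [1.0]))
    w = 2.0 / (N * (N + 1) * PN(xi) ** 2)
    return xi, w

N = 4
xi, w = gll(N)

ne, xmax, rho = 4, 10000.0, 2000.0     # illustrative set-up
le = xmax / ne                         # element length (all elements equal)
J = le / 2.0                           # Jacobian of the map [-1, 1] -> element
Me = rho * w * J                       # diagonal of the elemental mass matrix
```

Since the weights sum to 2, the entries of `Me` add up to ρ times the element length, i.e. the mass of the element per unit cross-section.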

The elemental stiffness matrix Ke is calculated by integrating over the elastic
coefficients mu (shear modulus) and the product of the derivatives of the basis
functions (initialized here through a function l1d implementing Eqs. 7.47 and
7.48), multiplied by the inverse Jacobian (one inverse Jacobian and one Jacobian
cancel each other out in the 1D case).
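
For illustration, the derivative matrix of the Lagrange polynomials can also be obtained by brute force, building each polynomial from its roots and differentiating it with NumPy. This is a swapped-in approach, not the formulation of Eqs. 7.47 and 7.48. The helper name `lagrange_derivatives` is hypothetical, and the values μ = ρ v_s² and element length 40 m are assumptions derived from Table 7.2.

```python
import numpy as np

def lagrange_derivatives(xi):
    """l1d[i, k]: derivative of the i-th Lagrange polynomial at node xi[k]."""
    n = len(xi)
    l1d = np.zeros((n, n))
    for i in range(n):
        others = np.delete(xi, i)
        # monic polynomial with roots at the other nodes, scaled to 1 at xi[i]
        coeffs = np.poly(others) / np.prod(xi[i] - others)
        l1d[i, :] = np.polyval(np.polyder(coeffs), xi)
    return l1d

xi = np.array([-1.0, 0.0, 1.0])          # GLL points for N = 2
w = np.array([1/3, 4/3, 1/3])            # GLL weights for N = 2
mu = 2000.0 * 2500.0**2                  # shear modulus (assumed: rho * vs^2)
le = 40.0                                # element length (assumed: 10 km / 250)
Ji = 2.0 / le                            # inverse Jacobian

l1d = lagrange_derivatives(xi)
# Ke[i, j] = mu * Ji * sum_k w[k] * l1d[i, k] * l1d[j, k]
Ke = mu * Ji * np.einsum("k,ik,jk->ij", w, l1d, l1d)
```

For N = 2 the quadrature is exact for this integrand, so `Ke` reproduces the classic quadratic finite-element stiffness matrix μ/(3 le) · [[7, −8, 1], [−8, 16, −8], [1, −8, 7]].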

[Fig. 7.17 flowchart. Stages: global initialization, SEM initialization, matrix
initialization, time extrapolation. Boxes: space-time domain; polynomial order;
source location, source time function; Earth model (ρ, μ); integration weights;
collocation points; Jacobian; elemental mass matrix; elemental stiffness matrix;
global matrix assembly; source time function.]

As indicated in Eq. 7.44, the spatial source function also has to be projected
onto the basis functions and integrated. In general, the force vector will vary in
each element (zero except for the source point or region). Therefore, it is
initialized as a matrix of size ne × (N + 1) with ne the number of elements and N the
polynomial order. The matrix s represents the spatial source function, which can
contain several source locations (e.g. a finite source) to be injected at collocation
points. The code fragment below describes the force initialization in the general
case with an example of a point source input in the central element ne/2 at the
first collocation point (i.e. at the boundary between element ne/2−1 and ne/2).


# Initialization
fe = zeros((ne, N+1))
s = zeros((ne, N+1))
# Point source at the first collocation point (index 0) of the central element
s[int(ne/2), 0] = 1
# Force vector
for k in range(ne):
    for i in range(N+1):
        fe[k,i] = s[k,i] * w[i] * J


Before the time extrapolation, the elemental force vectors are assembled to a
global form (see Section 7.5), in which the entries at shared element boundaries
are summed, resulting in a vector with n_g = n_e × N + 1 elements.
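
The assembly of elemental vectors into a global vector can be sketched as below. The helper `assemble_vector` is a hypothetical name, and the small set-up (N = 2, three elements, J = 20 m, ρ = 2,000 kg/m³) is chosen only to keep the example readable. The same routine also assembles the diagonal of the global mass matrix.

```python
import numpy as np

def assemble_vector(elemental, N):
    """Sum elemental vectors of shape (ne, N+1) into a global vector of length ne*N+1."""
    ne = elemental.shape[0]
    glob = np.zeros(ne * N + 1)
    for k in range(ne):
        # neighbouring elements share one collocation point, so slices overlap by one
        glob[k * N : k * N + N + 1] += elemental[k]
    return glob

N, ne = 2, 3
w = np.array([1/3, 4/3, 1/3])            # GLL weights for N = 2
J = 20.0                                 # Jacobian (assumed element length 40 m)
rho = 2000.0

s = np.zeros((ne, N + 1))
s[1, 0] = 1.0                            # point source at the first point of element 1
fe = s * w * J                           # elemental force vectors
f = assemble_vector(fe, N)               # global force vector

Me = rho * w * J                         # elemental mass (diagonal), same for all elements
Mg = assemble_vector(np.tile(Me, (ne, 1)), N)
```

At interior element boundaries the global mass entry is the sum of the last entry of one element and the first entry of the next, as described for Eq. 7.50.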

As an example of a global matrix assembly we present a code fragment initializing
the global stiffness matrix K of size ng × ng from elemental stiffness matrices
Ke of size (N + 1) × (N + 1). Here, i0 and j0 are the indices in the global matrix
from which the elemental stiffness matrices are added. Note that here we assume equal
stiffness matrices for each element. It is straightforward to make them
element-dependent (see exercises and supplementary material).


# Global Stiffness Matrix
K = zeros([ng, ng])
# Values except at element boundaries
for k in range(1, ne+1):
    i0 = (k-1) * N + 1
    j0 = i0
    for i in range(-1, N):
        for j in range(-1, N):
            K[i0+i, j0+j] = Ke[i+1, j+1]
# Values at element boundaries
for k in range(2, ne+1):
    i0 = (k-1) * N
    j0 = i0
    K[i0, j0] = Ke[0,0] + Ke[N,N]


Fig. 7.17 Schematic workflow of the spectral-element (SEM) solution of the
elastic wave equation. A substantial part consists of preparing the interpolation
and integration procedures required to initialize the global mass and stiffness
matrices. The final time extrapolation is extremely compact and does not require
the inversion of a global matrix (whereas classic finite-element methods do).

Table 7.2 Spectral-element simulation, homogeneous case.

Parameter    Value
x_max        10 km
ne           250
v_s          2,500 m/s
ρ            2,000 kg/m³
N            2-8
ε            0.8


Following the assembly of the global matrices as described in the previous section,
the vector of coefficients u (which does in fact correspond to the displacement
values due to the clever choice of our basis functions) can be extrapolated to the
next time step unew by a simple finite-difference approximation. Here, for
illustrative purposes, Minv (inverse mass matrix) is initialized as a diagonal matrix, f
contains the forcing, and uold the coefficients at the previous time step.

Remapping of the coefficient vector allows the subsequent extrapolation to
the next time step in each iteration step using implicit matrix-vector operations.
Here, nt is the global number of time steps that depends on the desired seis- 
mogram length and the minimum distance between collocation points and the 
choice of elastic parameters. The various other initializations that have to be done 
prior to the simulation are similar to the other numerical techniques presented in 
the previous chapters, and so are not further illustrated here (see supplementary 
electronic material). 


# [...]
# Time extrapolation
for it in range(nt):
    # Extrapolation (Eq. 7.53)
    unew = dt**2 * Minv @ (f - K @ u) + 2 * u - uold
    uold, u = u, unew
# [...]


Note that, as illustrated already in Chapter 6 on the finite-element method, this
matrix-vector extrapolation scheme is formally identical to the finite-difference
method, if the global stiffness matrix is replaced by a scaled finite-difference
operator.

Finally, let us analyse examples with numerical solutions of the wave equation
solved with the above algorithm (Eq. 7.53). The parameters of a simulation in
a homogeneous material are given in Table 7.2. The source time function has a
dominant period of T_dom = 0.15 s and is initialized by

\[
s(t) = -2a\,(t - t_0)\, e^{-a^2 (t - t_0)^2}
\tag{7.54}
\]

with a = 4/T_dom. This is the first derivative of a Gaussian. The receiver is at a
distance of 3 km from the source. In Fig. 7.18 we compare the numerical solution
(dashed line) with the analytical solution (black line) and calculate the relative
energy misfit in % as a function of the polynomial order used in the simulation,
while everything else is kept constant.

[Fig. 7.18 panel titles: Error: 6.567 % (N = 2); Error: 0.051 % (N = 4).]
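
A source time function of this kind can be sketched in a few lines. The exact form of the exponent follows the reading of Eq. 7.54 as the first derivative of a Gaussian with a = 4/T_dom. The delay `t0 = T_dom`, the sampling interval, and the function name are assumptions made for this illustration.

```python
import numpy as np

def source_time_function(t, T_dom, t0):
    """First derivative of a Gaussian, s(t) = -2 a (t - t0) exp(-(a (t - t0))^2)."""
    a = 4.0 / T_dom
    return -2.0 * a * (t - t0) * np.exp(-(a * (t - t0)) ** 2)

T_dom = 0.15                      # dominant period in s (homogeneous example)
t0 = T_dom                        # assumed delay so that the signal starts near zero
dt = 1.0e-3                       # assumed sampling interval in s
t = np.arange(0.0, 0.6, dt)
stf = source_time_function(t, T_dom, t0)
```

The function is antisymmetric about t0 and vanishes there, as expected for the derivative of a symmetric pulse.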

The spectral-element algorithm as implemented above allows us to change the 
spatial accuracy with only one parameter N without any further modification. 
However, note that, assuming constant stability criterion ε = 0.8, the time step dt
is decreasing with increasing order as the minimum distance between collocation 
points is the decisive factor. This effect was discussed at length in connection with 
the Chebyshev pseudospectral method. The decreasing time step with polynomial 
order actually prevents us going to much higher orders. In realistic simulations 
(e.g. using specfem3d) the highest order is usually N = 4. 
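
The dependence of the time step on the polynomial order can be made concrete with a short calculation. The sketch assumes that the time step is chosen as dt = ε · dx_min / v, with dx_min the smallest distance between GLL points in physical coordinates. The function names are hypothetical and the parameters are those of Table 7.2.

```python
import numpy as np
from numpy.polynomial import Legendre

def gll_points(N):
    """GLL collocation points on [-1, 1] for order N >= 1."""
    interior = Legendre.basis(N).deriv().roots()
    return np.concatenate(([-1.0], np.sort(interior.real), [1.0]))

def stable_time_step(N, le, v, eps):
    """Time step from the (assumed) criterion dt = eps * dx_min / v."""
    dx_min = np.min(np.diff(gll_points(N))) * le / 2.0   # map [-1, 1] to element length
    return eps * dx_min / v

le = 10000.0 / 250        # element length in m
v = 2500.0                # velocity in m/s
eps = 0.8                 # stability criterion

dts = {N: stable_time_step(N, le, v, eps) for N in range(2, 9)}
for N, dt in dts.items():
    print(f"N = {N}: dt = {dt:.5f} s")
```

Because the GLL points cluster towards the element boundaries, `dx_min` shrinks faster than the average point spacing as N grows, which is why the time step decreases so quickly with order.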

The results shown in Fig. 7.18 indicate how the simulation with polynomial 
order N = 2 (corresponding to a regular grid) would not lead to a sufficiently 
accurate solution for the chosen set-up. It is instructive to see how the solution 
accuracy dramatically improves when the order of the interpolation (and integra- 
tion) scheme increases—of course at the cost of longer simulation time. This is 
due to the increase in the number of floating point operations per time step and 
the increasing number of time steps due to the decrease of smallest grid-point 
distance. 
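
The relative energy misfit quoted in Fig. 7.18 can be evaluated with a one-line function. The definition used below (energy of the difference divided by the energy of the analytical solution) is an assumption, since the text does not spell it out. The two traces are synthetic stand-ins, not simulation output.

```python
import numpy as np

def energy_misfit_percent(numerical, analytical):
    """Relative energy misfit in %, assuming the definition sum(diff^2) / sum(ref^2)."""
    return 100.0 * np.sum((numerical - analytical) ** 2) / np.sum(analytical ** 2)

t = np.linspace(0.0, 1.0, 200)
analytical = np.sin(2.0 * np.pi * 5.0 * t)     # stand-in reference trace
numerical = 1.1 * analytical                   # stand-in trace with a 10% amplitude error
misfit = energy_misfit_percent(numerical, analytical)
```

With this definition a 10% amplitude error corresponds to a misfit of 1%, which gives a feeling for the size of the errors reported in the figure.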

In many cases one aims to have an overall error of the numerical solution of 
<1%. It is interesting to analyse the behaviour of the numerical solutions for this 
simple case as a function of the spectral-element-specific parameters that enter the 
algorithm and compare with the other methods discussed in this volume. This
can be done using the extensive electronic supplementary material. In the
homogeneous case we are fortunate to have an analytical solution to compare with.
This is not the case for general heterogeneous models. The next section presents a
heterogeneous model and discusses a strategy for benchmarking complex models.

Fig. 7.18 Spectral-element method, homogeneous medium. Comparison of
numerical (dashed) and analytical (black) solution and their differences (dotted) of
the 1D spectral-element implementation. The order of the interpolation (and
integration) scheme is given at the top of the figures together with the overall
energy misfit in %. Note the improvement of the numerical solution with increasing
polynomial order N.

Table 7.3 Spectral-element simulation, heterogeneous case.

Parameter    Value
x_max        10 km
ne           250
v_s          2,500 m/s
ρ            2,000 kg/m³
N            2-12
ε            0.5
T_dom        0.12 s

14 In practice we initialize a vector with random numbers, low-pass filter with the
desired corner wavelength (here 500 m), and scale with the perturbation amplitude
(here 25%).

Fig. 7.19 Spectral-element method, heterogeneous case. Top: Shear velocity
model with smooth random perturbation (solid line). The spatial source function
is indicated by a dashed line. Middle: Displacement wavefield at t = 1.7 s for a
simulation with order N = 2. Bottom: Simulation with order N = 4.


7.7.2 Heterogeneous example 


In the spectral-element method the parameters ρ and μ can vary at each
collocation point. However, beware what this really means. Similar to the problem
discussed in the previous section, any function that is described on the
collocation points is multiplied by the Lagrange polynomials. This implies that a sharp
discontinuity is replaced by a smooth representation and the Gibbs phenomenon
occurs (explore this with the codes given in the supplementary material).
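
The effect can be reproduced with a few lines of code. The sketch below samples a velocity jump at the GLL points of a single element of order N = 8 and evaluates the Lagrange interpolant on a fine grid. The velocities, the order, and the helper names are illustrative assumptions and this is not the code of the supplementary material.

```python
import numpy as np
from numpy.polynomial import Legendre

def gll_points(N):
    """GLL collocation points on [-1, 1] for order N >= 1."""
    interior = Legendre.basis(N).deriv().roots()
    return np.concatenate(([-1.0], np.sort(interior.real), [1.0]))

def lagrange_interpolate(xi, values, x):
    """Evaluate the Lagrange interpolant through (xi, values) at the points x."""
    result = np.zeros(len(x))
    for i in range(len(xi)):
        others = np.delete(xi, i)
        # i-th Lagrange polynomial: product over k != i of (x - xi_k) / (xi_i - xi_k)
        basis = np.prod((x[:, None] - others) / (xi[i] - others), axis=1)
        result += values[i] * basis
    return result

N = 8
xi = gll_points(N)
model = np.where(xi < 0.0, 2000.0, 3000.0)     # sharp jump inside the element
x = np.linspace(-1.0, 1.0, 1001)
interp = lagrange_interpolate(xi, model, x)
overshoot = interp.max() - model.max()         # positive: interpolant exceeds the model
```

The interpolant matches the model exactly at the collocation points but oscillates in between, exceeding the largest model value. This is why sharp discontinuities are better aligned with element boundaries.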

From an algorithmic point of view the extension to the heterogeneous case
is straightforward. Mass and stiffness matrices have to be initialized separately
for each element, with ρ and μ varying at each collocation point. That means,
when increasing the order of a scheme, the space-dependent parameters are
interpolated in a slightly different way even though the results should converge.

We demonstrate this convergence behaviour with a random 1D velocity model.
The parameters for the simulations are shown in Table 7.3. The constant velocity
model is perturbed with a random perturbation.¹⁴ The resulting velocity model is
shown in Fig. 7.19 (top).

A source is injected at the centre of the model with a dominant period
T_dom = 0.12 s (see Table 7.3 and Eq. 7.54). The wavefield propagates in both
directions through the random model and is captured after a simulation time of
t = 1.7 s in Fig. 7.19 for orders N = 2, 4. The parameters have been chosen such
that we can illustrate the improvement of the solution with increasing order. In
the figure the direct wave has been clipped so that the effect of the accuracy
improvement becomes more visible. The simulation with order N = 2 shows strongly
dispersive behaviour. Increasing to order N = 4 substantially reduces the
high-frequency numerical noise.

[Fig. 7.19, top panel axes: v_s (m/s) from 2,000 to 3,000; x from 0 to 10,000 m.]
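
The construction of a smooth random velocity model (random numbers, low-pass filter, scaling) can be sketched as follows. Several choices here are assumptions: the model is defined on a regular grid rather than on the GLL points, the low-pass filter is a sharp cut-off in the wavenumber domain, and the perturbation is normalized so that its peak equals the stated amplitude. The function name is hypothetical.

```python
import numpy as np

def random_velocity_model(x, v0, amplitude, corner_wavelength, seed=0):
    """Constant velocity v0 with a smooth random perturbation (regular grid x)."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(x.size)
    dx = x[1] - x[0]
    k = np.fft.rfftfreq(x.size, d=dx)               # wavenumber in cycles per metre
    spectrum = np.fft.rfft(noise)
    spectrum[k > 1.0 / corner_wavelength] = 0.0     # remove short wavelengths
    spectrum[0] = 0.0                               # remove the mean
    smooth = np.fft.irfft(spectrum, n=x.size)
    smooth /= np.max(np.abs(smooth))                # peak perturbation of one
    return v0 * (1.0 + amplitude * smooth)

x = np.linspace(0.0, 10000.0, 1001)                 # 10 m spacing
vs = random_velocity_model(x, v0=2500.0, amplitude=0.25, corner_wavelength=500.0)
```

In a spectral-element code the resulting profile would still have to be sampled at the collocation points of each element, for example with `np.interp`.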

But still ... how do we know our result is correct? This is a situation that 
always occurs when simulating heterogeneous models. One possible approach 
is to gradually increase the order (accuracy) of the scheme to see whether the 
results still change. A trick often used is to carry out one simulation at very high 
resolution (which might be expensive in the 3D case) and use this as a reference 
solution. That is fine if you are absolutely confident that there is no bug in your 
code. An alternative is to compare your results with other codes, possibly using
other numerical methods. Projects with various benchmark models are discussed
in the Appendix. 

The convergence behaviour in our heterogeneous example is documented in 
Fig. 7.20 for orders N = 2,...,12. The root-mean-square (rms) difference to 
the simulation with order N = 12 is shown as a function of order. At the same 
time we record the elapsed time of the extrapolation part. Visually the results do 
not change much for orders N > 4 but the error steadily decreases. This comes 
at the expense of a steady increase in computation time (here with respect to the 
simulation for N = 2). This is because, while keeping everything else constant, 
(1) the number of grid points and thus the floating-point-operations per time step 
increase, and (2) due to the decreasing minimal grid distance, the time step is de- 
creasing accordingly. This effect is of course even more pronounced in 2D or 3D. 


7.8 The road to 3D 


The power of the spectral-element method lies in its efficient extension to 2D and 
3D while keeping the explicit time marching scheme possible through the diago- 
nal mass matrix when using hexahedral grids. Curved elements do not cause any 
problems, allowing efficient and accurate implementation of curved free surfaces 
(rather than approximating them as straight-line segments, see Fig. 7.21). In this 
section we merely give hints as to which articles can be useful in gaining an un- 
derstanding of the spectral-element concepts applied to 3D. Many more articles 
using this method for research are presented in Chapter 10 on applications. 

The basic spectral-element algorithms using Lagrange polynomials for 2D or 
3D elastic wave propagation were presented in Komatitsch (1997), Komatitsch 
and Vilotte (1998), Komatitsch et al. (1999), and Komatitsch and Tromp (1999). 
These studies also document improvements using high-order time-extrapolation 
schemes (e.g. the Newmark scheme). The solution to fluid-solid media, highly
relevant for marine exploration problems or global wave propagation, was
presented by Komatitsch et al. (2000b). Further extensions include anisotropic
media (Komatitsch et al., 2000a), and poroelastic media (Boxberg et al., 2015).
Further verification against analytical solutions was presented by Martin (2011).

Fig. 7.20 Convergence with increasing order. Convergence behaviour of the
simulation shown in Fig. 7.19. The left axis shows the decrease of the rms error (%)
as the polynomial order of the scheme increases. In the right axis the increase of
the elapsed simulation time (for the time extrapolation part only) is shown.

Fig. 7.21 In 3D, elements might be skewed and have curved boundaries. In
analogy to the 1D case, curved hexahedra are mapped to a reference interval
through the Jacobian transformation. Figure courtesy of B. Schuberth.

Fig. 7.22 Spectral elements with tetrahedra. Two examples of a high-order
nodal spectral-element discretization with tetrahedra. The mass matrix is no
longer diagonal and has to be inverted numerically. From May et al. (2016).

The great success of the spectral-element method is also linked to the fact that 
in combination with the cubed-sphere approach it is currently the only production 
code for 3D global wave propagation. This was made possible through the work 
of Chaljub et al. (2003), Komatitsch and Tromp (2002a), and Komatitsch and 
Tromp (2002b), with many subsequent applications.

Regular-grid spectral-element implementations also have their merits, in par- 
ticular when Earth models with moderate and smooth velocity perturbations are 
the target. Detailed algorithms for the case of wave propagation on a regional 
scale were presented in Fichtner and Igel (2008), Fichtner (2009), and Fichtner 
(2010). Further attractive applications are the axisymmetric approach for global 
wave propagation of Nissen-Meyer et al. (2007) and Nissen-Meyer et al. (2014), 
with recent online facilities developed by van Driel et al. (2015b).

The spectral-element method is now routinely used for forward problems of 
all kinds (including rupture problems), as well as full waveform inversion. The
flexibility with hexahedral meshes led to the possibility of generating meshes with
substantially varying element sizes (i.e. h-adaptivity). However, models with very
discontinuous behaviour, or extremely complex geometries, still cause problems 
because of the difficulties in generating hexahedral meshes. 

Attempts have been made to extend the spectral-element concepts to triangular
meshes (Mercerat et al., 2006) and recently to tetrahedral meshes (May et al.,
2016, see Fig. 7.22). The unstructured grids do not allow a diagonal mass ma- 
trix, so that, as in the classic finite-element methods, linear algebra libraries are 
required to solve the global linear system of equations. It will be interesting to see 
how these new methods compare with the discontinuous Galerkin-type methods. 

The problems with finding appropriate meshes with hexahedral element 
shapes motivated the search for methods that (1) are based on tetrahedral (or 
arbitrarily shaped) elements that are easier to adapt to complex geometries, and 
(2) can better handle discontinuities in the solution field or the geophysical
parameters. These problems are addressed with the remaining approaches: the
finite-volume and the discontinuous Galerkin methods.


Chapter summary 


• The spectral-element method combines the flexibility of finite-element
methods with respect to computational meshes with the spectral convergence
of Lagrange-basis functions used inside the elements.

• The enormous success of the spectral-element method is based upon
the diagonal structure of the mass matrix that needs to be inverted
to extrapolate the system in time combined with the spectral convergence
of the basis functions. Due to the diagonality, no matrix inversion
techniques need to be employed, allowing straightforward parallelization
of the algorithm.

• The diagonal mass matrix is made possible by superimposing the
collocation points of both interpolation and integration schemes
(Gauss-Lobatto-Legendre integration).

• The errors of the spectral-element scheme accumulate from the (usually
low-order finite-difference) time-extrapolation scheme and the numerical
integration using Gauss-Lobatto-Legendre quadrature.

• In principle, the spectral-element method can also be formulated with
other basis functions with similar (or even better) interpolation and
integration properties (e.g. Chebyshev polynomials). However, then the
mass matrix is not diagonal and a global system matrix needs to be
inverted.

• Spectral-element solutions are usually formulated for hexahedral
computational grids. For complex models (surface topography, internal curved
boundaries) this might involve cumbersome mesh generation. Formulations
for triangles or tetrahedra are in principle possible but the advantage
of a diagonal mass matrix is lost.

• The spectral-element method is particularly useful for simulation
problems where an uneven free surface plays an important role, and/or in
which surface waves need to be accurately modelled. The reason is that
the free-surface boundary is implicitly solved.

• Several well-engineered community codes are available for Cartesian and
spherical geometries including basin scale, continental scale, and global
Earth (or planetary scale) calculations.


FURTHER READING 


• Fichtner (2010) provides further mathematical details on the
spectral-element method (interpolation and integration schemes) and discusses
forward and inverse problems in 3D.

• Pozrikidis (2005) is perhaps the most exhaustive book on the
spectral-element method, with many examples provided using Matlab. The
mathematical background is explained in great detail.

• Peter et al. (2011) give an excellent review of the capabilities of the
specfem3d code family for both forward and inverse modelling.
EXERCISES


Comprehension questions


(7.1) What is the main difference between classical finite and spectral-element
methods? What is the meaning of spectral in this context?

(7.2) What is the free-surface boundary condition? Explain qualitatively why
this boundary condition is implicitly fulfilled in finite- (and spectral-)
element methods. Which problems in seismology might benefit from this
behaviour?

(7.3) The spectral-element method allows in principle arbitrary high-order
polynomials inside the elements. Can you give a reason why in practice
only low-order polynomials (usually N ≤ 4) are used, even for large
simulations with long propagation distances?

(7.4) Why can sin and cos functions not be used within the spectral-element
framework, given that they are so efficient for the pseudospectral method?

(7.5) Explain the concepts of weak form and strong form of the wave equation.

(7.6) What is meant by exact interpolation at collocation points? Does it mean
the solution is exact everywhere inside an element?

(7.7) Do you know how the mass and stiffness matrices got their names? Hint:
This has to do with the field in which the finite-element method was
developed.

(7.8) Compare finite-difference and spectral-element methods in terms of their
potential domains of application in the field of seismic wave propagation.
Give arguments.


Theoretical problems 


(7.9) In the spectral-element method each element has N + 1 collocation points
including the boundaries, where N is the polynomial order. Derive the
equation for ng, the global number of degrees of freedom (i.e. collocation
points), in the 1D case for a problem with ne elements.

(7.10) We want to find a set-up for a simulation task. Assume that you want to
propagate 10,000 km with velocity of c = 5 km/s. The stability criterion is
given by c dt/dx < 0.5. Assume that 10 points per wavelength are enough
to achieve sufficient accuracy. The dominant frequency of your wavefield
is 0.2 Hz (i.e. period 5 s like crustal surface waves). The elements are
discretized by Gauss-Lobatto-Legendre points. Examples are given in
Table 7.1. Calculate the required number of spectral elements and the
time steps for orders N = 2, 3, and 4. How many time steps would you
roughly expect for each simulation?

(7.11) The Lagrange polynomials of order N are given by

\[
\ell_i^{(N)}(x) = \prod_{k=1,\, k \neq i}^{N+1} \frac{x - x_k}{x_i - x_k}, \qquad i = 1, \ldots, N+1 .
\]

Write down all polynomials \ell_i^{(N)}(x) for N = 2 and general points x_k with
k = 1, 2, 3. Show that with N = 1 you recover the definition of linear-basis
functions introduced in Chapter 6.

(7.12) The function f(x) = 1/2x* — 1/3x° is defined in the interval x ∈
[0, 1]. Evaluate its integral analytically. Calculate the integral using GLL
quadrature for orders N = 1-4 (see Table 7.1). Compare analytical and
numerical results.

(7.13) Derive the elemental mass matrix with Lagrange polynomials (Eq. 7.44)
starting with the general form given in Eq. 7.11.

(7.14) Use the recursion formula Eq. 7.46 and derive the Legendre polynomials
for order N = 0-4. Plot the results in the interval [-1, 1].


Programming exercises 


(7.15) Use the information on the GLL collocation points in Table 7.1 to
write a function lagrange that returns the Lagrange polynomials ℓ_i(ξ) with
i ∈ [0, N] for arbitrary ξ ∈ [-1, 1], where N is the order (see equation in
exercise 7.11).

(7.16) Define an arbitrary function f(x) and use the lagrange routine of the
previous problem (or the supplementary material) to calculate the
interpolating function for f(x). Show that the interpolation is exact at the
collocation points. Compare the original function f(x) and the
interpolating function on a finely spaced grid. Vary the order of the interpolating
polynomials and calculate the error as a function of order.

(7.17) We want to investigate the performance of the numerical integration
scheme (Gauss integration). Based on Table 7.1, write a program that
performs GLL integration on the GLL points. Define a function f(x) of
your choice and calculate analytically the integral ∫ f(x) dx for the
interval [-1, 1]. Perform the integration numerically and compare the results.
Modify the function and the order of the numerical integration. Discuss
the results. Note: The error of the spatial scheme in the spectral-element
method comes only from this integration step.

(7.18) Use the 1D spectral-element code (supplementary material) to
determine experimentally the stability limit as a function of the order N of
the Lagrange interpolation.

(7.19) Increase the order of the scheme and observe the necessary decrease of
the time step, keeping the Courant criterion constant.

(7.20) Modify the spectral-element code to allow for space-dependent elastic
parameters and density. Introduce a low-velocity zone (−30%) at the
centre of the model spanning 5 elements. Input the source inside this zone
and discuss the resulting wavefield.

(7.21) Introduce h-adaptivity (each element may have different size h) to the
numerical scheme by making the Jacobian element dependent. Generate
a space-dependent mesh size (e.g. decreasing the element size gradually
towards the centre). Generate a velocity model that keeps the number of
points per wavelength approximately constant.

(7.22) Use the power of the 1D spectral-element scheme to implement a
strongly heterogeneous computational mesh: (1) a low-velocity zone in
the middle of the region (source in and outside this region); (2) vary
the element size using a Gauss function; (3) vary the element size
randomly within some bounds. Document the effect on the solution in
the homogeneous case. Investigate the effects on the waveforms for the
heterogeneous case. Make sure you choose the right time step!

(7.23) Define an arbitrary function in the interval [-1, 1]. Use the available
GLL and lagrange routines to compare the interpolation behaviour of
Lagrange polynomials on regular grids vs. GLL points. Plot the energy
misfit as a function of polynomial order (Runge phenomenon).


The Finite-Volume Method 


All numerical methods we have encountered so far work (reasonably) well for 
the solution of wave equations. Seismic wave-propagation problems are usually 
characterized by the fact that the solutions are sufficiently smooth, so that, in 
practice, we simulate band-limited wavefields. This holds in most cases even if 
the parameters of the elastic wave equation (i.e. the seismic velocity model) are 
discontinuous. But what if the solution is characterized by discontinuities (e.g. 
shock waves, gravity waves, transport problems, etc. see Fig. 8.1)? This question 
led to the development of the finite-volume method. 

In many problems of physics (e.g. fluid flow, material transport, advection 
problems) that are described by partial differential equations, the initial condi- 
tions (or source terms) contain discontinuities. In terms of spectral content this 
implies that infinite frequencies are part of the solution. We have seen in Chap- 
ter 5 on pseudospectral methods that discontinuities cause problems (the Gibbs 
phenomenon) as soon as we have to limit the wavenumber range of our solution 
(e.g. due to spatial discretization). 

To some extent, the finite-volume method is a way to avoid taking spatial 
derivatives of the solution fields, by replacing them with so-called flux terms. The 
finite-volume method naturally follows from conserving mass (or a tracer con- 
centration, elastic energy, etc.) in a volume cell of an advective system, balancing 
it with the flux into and out of it. This leads to the classic advection equation of 
material transport, and in fact this simple principle can also be used to derive the 
acoustic wave equation (Leveque, 2002). 

The advection equation is an expression of a hyperbolic conservation law that 
is fundamental for many branches of continuum physics. It turns out that the 
seismic wave equation can be cast in this mathematical form. This implies that we 
can transfer results concerning the numerical solution of scalar advection directly 
to problems in seismology. 

The most important advantage of the finite-volume method is that it allows, in
principle, arbitrarily shaped computational cells. Despite this flexibility,
finite-volume methods have so far not been used for large-scale seismic simu- 
lation problems. However, the flux concepts introduced here are fundamental 
for the understanding of the discontinuous Galerkin method, discussed in Chap- 
ter 9. In fact, in their lowest-order implementations, the finite-volume and the 
discontinuous Galerkin methods are identical. 

This chapter is structured as follows. After a brief section on the history of
the finite-volume method we present, as always, the method in a nutshell. This
is followed by a derivation of the numerical solution of the 1D scalar advection
and the elastic wave equation using the finite-volume method from first principles.
Finally, we analyse examples and compare the results with other methods
encountered so far.

Fig. 8.1 Simulating the breaking of gravity waves is a challenging task. The
finite-volume method was developed with a view to solving problems with
discontinuous solutions and complex geometrical features.

Fig. 8.2 Topographic mesh. Triangulated topography of a seamount model lending
itself to a volumetric discretization using tetrahedra. The finite-volume method
offers an elegant solution to problems on tetrahedral meshes.


8.1 History 

The story of finite volumes in seismology is quickly told. The method itself 
appeared in the early nineties with applications primarily in plasma physics (Her- 
meline, 1993), and computational fluid dynamics (Versteeg and Malalasekera, 
1995). In an effort to investigate its use for seismic wave propagation, Dormy 
and Tarantola (1995) used the discrete version of the divergence theorem in an 
elegant way to derive a scheme that allowed the simulation of wave propagation 
through arbitrary cell shapes. Their approach can be considered an extension 
of the staggered-grid finite-difference method to arbitrary geometries. In its 
basic form it converges to the classic staggered-grid finite-difference method 
for regular grids (Virieux, 1986) and to the solution for minimal (hexagonal) 
grids (Magnier et al., 1994). The method was tested in two dimensions in its 
lowest-order form. A more mathematical treatment of the finite-volume method 
was presented in Eymard et al. (2000). 

Käser et al. (2001) and Käser and Igel (2001) compared various numerical
approaches, including the finite-volume concept of Dormy and Tarantola (1995),
and quantified the accuracy of wave propagation modelled through unstructured 
grids in comparison with classic regular-grid methods. They concluded that, de- 
spite the flexibility of unstructured meshes, the price to pay is high in the sense 
that many more grid points per wavelength are necessary compared to classic 
techniques. This experience eventually led to the introduction of the high-order 
discontinuous Galerkin method to seismology. 

High-order extensions were presented for advection-type problems by 
Ollivier-Gooch and Van Altena (2002) and Wang (2002) for general conservation 
laws. Harder and Hansen (2005) applied the finite-volume method to geophysical 
fluid flow, and Tadi (2004) developed an algorithm for 2D elastic wave propaga- 
tion. In the comprehensive textbook by Leveque (2002) the finite-volume method 
is presented as a natural consequence of conservation laws. It is shown that the 
mathematical structure of elastic wave-propagation problems (in first-order form 
as a coupled system) is identical to the advection problem. This implies that the 
same numerical concepts developed for conservation laws can be directly ap- 
plied to elastic wave propagation. This approach was adopted by Dumbser et al. 
(2007a) who presented the arbitrary high-order scheme (ADER) of the finite- 
volume method for seismic wave propagation. This removes the argument used 
by many against the finite-volume method that it cannot be easily extended to 
higher orders. Dumbser et al. (2007a) show that the finite-volume method might 
well be a competitive scheme, in particular for triangular or tetrahedral meshes
(see Fig. 8.2). Further applications in seismology were presented by Benjemaa
et al. (2007) and Benjemaa et al. (2009) for the dynamic rupture problem. 

To my knowledge, the finite-volume method is not widely used today for 
large-scale problems in seismological research. As mentioned earlier, it can be un- 
derstood as a special case of the discontinuous Galerkin method, and combined 
use (low- and high-order) might be a useful strategy for some applications. 


8.2 Finite volumes in a nutshell 


The finite-volume method was developed around the problem of transporting 
(advecting) material and conserving the integral quantity. As the first-order linear 
advection problem is formally equivalent to the elastic wave-propagation prob- 
lem, all methods derived for the former equation equally apply. Therefore, we 
first present the solution to the scalar advection problem, later extending to more 
general cases. 

The finite-volume method in its basic form takes an entirely local viewpoint,
in the sense that the solution field $q(x, t)$ is tracked inside a representative cell of a
finite volume. As the exact solution is not known, the field is approximated by an
average quantity $Q_i^n$ inside cell $\mathcal{C}$ as

$$Q_i^n = \frac{1}{dx}\int_{\mathcal{C}} q(x, t)\,dx. \qquad (8.1)$$

Here, as before, the lower index $i$ denotes the cell $\mathcal{C}$ and the upper index denotes time
level $t_n = n\,dt$. The cell $\mathcal{C}$ is centred at $x = x_i$, with left and right boundaries
defined at $x_i - \frac{1}{2}dx$ and $x_i + \frac{1}{2}dx$, respectively (see Fig. 8.3).

Tracking the change of the values with time inside each cell implies that we
equate the change of quantity $Q_i$ from time step $n$ to $n + 1$ with the fluxes through
the boundaries such that

$$\frac{Q_i^{n+1} - Q_i^n}{dt} = \frac{F_{i-1/2}^n - F_{i+1/2}^n}{dx}, \qquad (8.2)$$

where $F_{i\pm1/2}^n$ represent time integrals of the fluxes from time $t_n$ to $t_{n+1}$. As
information propagates with finite speed, it is reasonable to assume that, for example
for the left boundary, the flux depends on the adjacent $Q^n$ values only:

$$F_{i-1/2}^n = f(Q_{i-1}^n, Q_i^n). \qquad (8.3)$$

The requirement of conservation for a transport (advection) problem leads
(entirely from basic principles) to the advection equation of the form

$$\partial_t q(x, t) + a\,\partial_x q(x, t) = 0, \qquad (8.4)$$


where $a$ is a transport velocity. We will show in what follows that the elastic
wave-propagation problem is formally equivalent to this equation and that all solution
procedures derived for this simple advection problem can be used.

Fig. 8.3 Principle of the finite-volume method illustrated for the scalar advection
problem for a random initial field. Bottom: Snapshot in space of the solution
field $Q(x)$. Top: Close-up of a detail of the solution field for cell volumes of
size $dx = 3$. The values $Q_i$ inside the volumes represent an average over the
solution $q(x)$ (dotted line). Using knowledge about the advection equation (here:
the positive scalar advection velocity $a$), the values in each volume are updated
by estimating the fluxes $F_{i\pm1/2}$ across the boundaries at $x = x_i \pm \frac{1}{2}dx$.

For the constant-coefficient advection problem the flux terms are simply

$$F_{i-1/2}^n = a\,Q_{i-1}^n, \qquad F_{i+1/2}^n = a\,Q_i^n, \qquad (8.5)$$

where $a$ is the advection velocity. With these definitions we obtain a fully discrete
extrapolation scheme as

$$Q_i^{n+1} = Q_i^n + a\,\frac{dt}{dx}\left(Q_{i-1}^n - Q_i^n\right). \qquad (8.6)$$


Here we used the fact that the advection occurs in one direction only (upwind 
scheme). It is interesting to note (you might have seen it already) that the resulting
numerical algorithm is equivalent to a simple finite-difference solution using a
one-sided finite difference taken in the upwind direction. However, this result was obtained
with an entirely different approach. This will be further elaborated in the next 
section. 
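
The equivalence between the flux balance and the one-sided finite difference can be checked numerically. The following is a minimal sketch, not code from the original text: the function names, the random test field, and the periodic wrap-around (via `np.roll`) are assumptions made for illustration. It also confirms that the flux form conserves the total mass on a closed (periodic) domain.

```python
import numpy as np

def upwind_flux_step(Q, a, dt, dx):
    """One finite-volume step (Eq. 8.2) with upwind fluxes (Eq. 8.5), a > 0.

    Periodic wrap-around is an assumption made here to close the domain.
    """
    F_left = a * np.roll(Q, 1)   # F_{i-1/2} = a * Q_{i-1}
    F_right = a * Q              # F_{i+1/2} = a * Q_i
    return Q + dt / dx * (F_left - F_right)

def upwind_fd_step(Q, a, dt, dx):
    """The same update written as a one-sided finite difference."""
    return Q + a * dt / dx * (np.roll(Q, 1) - Q)

# Hypothetical test set-up: random cell averages, a*dt/dx = 0.5
rng = np.random.default_rng(0)
Q0 = rng.random(50)
a, dx = 1.0, 1.0
dt = 0.5 * dx / a

Q_flux = upwind_flux_step(Q0, a, dt, dx)
Q_fd = upwind_fd_step(Q0, a, dt, dx)

same_update = np.allclose(Q_flux, Q_fd)
mass_conserved = np.isclose(Q_flux.sum() * dx, Q0.sum() * dx)
print(same_update, mass_conserved)
```

Writing the update in terms of boundary fluxes makes conservation explicit: whatever leaves one cell through its right boundary enters the neighbouring cell through its left boundary.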

From a practical point of view the first-order scheme just introduced is useless,
as it is highly diffusive. A considerably better solution is the Lax-Wendroff
scheme, given as

$$Q_i^{n+1} = Q_i^n - \frac{a\,dt}{2\,dx}\left(Q_{i+1}^n - Q_{i-1}^n\right) + \frac{1}{2}\left(\frac{a\,dt}{dx}\right)^2\left(Q_{i-1}^n - 2Q_i^n + Q_{i+1}^n\right), \qquad (8.7)$$

which is second-order accurate and much less diffusive (Leveque, 2002).
From these very simple considerations a few important conclusions can be 
drawn. It appears that the finite-volume approach allows numerical schemes to 
be developed that are independent of cell shape, as long as appropriate fluxes 
are defined. Obviously, the assumption of constant values in each cell is sub- 
optimal. However, it leads in principle to finite-difference-type algorithms. This 
assumption can be relaxed, and recently arbitrary high-order reconstructions in- 
side grid cells have been proposed. The fundamental ingredient to finite-volume 
methods is the flux concept. The accurate calculation of the flux contribution to 
the cell update is related to the Riemann problem, which considers the advection
of a single discontinuity. Further interpretations and an alternative derivation of 
finite-volume concepts are given in Section 8.6. 


8.3 The finite-volume method via 
conservation laws 


Finite-volume methods were motivated by the challenge of finding solutions
to problems with strongly heterogeneous parameters and possible discontinuities
in the solution (e.g. shock waves). At discontinuities the partial-differential
equations no longer hold. However, the integral equations, from which the
finite-volume concepts are derived, are still valid, and this is a fundamental difference
to other numerical approaches. From this point of view it is not obvious that the 
seismic wave-propagation problem is a good candidate for this method. However, 
there are situations in which classic numerical methods like finite differences break 
down. In addition, the theoretical concepts introduced here will form the basis for 
the discontinuous Galerkin method presented in Chapter 9. 

To understand the fundamental concepts of finite-volume methods it is suf- 
ficient to consider the scalar advection equation that naturally follows from 
conserving mass in a flowing (advecting) system. In addition, the finite-volume 
concepts provide an alternative way to derive the equations for elastic wave motion
from first principles. We will closely follow the concepts presented in the
excellent book by Leveque (2002).

We will then show that the problem of elastic wave propagation can be formu- 
lated as a coupled first-order system formally equivalent to the advection problem. 
In that sense the constant coefficient linear elastic wave equation can be viewed 
as a conservation law for stress and velocity (or elastic energy). In what follows 
we will restrict ourselves to methods that assume constant (average) values inside 
the cell volumes. Extensions to higher-order representations inside the cells are 
briefly discussed at the end of this chapter. 

In the mid nineties an alternative way of presenting the finite-volume concept
was introduced by Dormy and Tarantola (1995), taking Gauss’s theorem
as a starting point. As I find this approach very attractive it will also be briefly 
presented. 

Fig. 8.4 The finite-volume method is based on the concept of describing the
evolution of a density (energy) field $q(x, t)$ inside a finite volume $\mathcal{C}$ over time.

Let us start by posing a simple question regarding the transport (advection) of
something (e.g. a tracer density in a flowing river, an isotope in an ocean, etc.).
To describe what is happening we put ourselves into a finite volume cell that we
denote as $\mathcal{C}$ (to keep it simple we stick here to 1D) and define the cell as limited
by $x \in [x_l, x_r]$. We further assume a positive advection speed $a$. This set-up is
illustrated in Fig. 8.4. The total mass of a quantity (e.g. tracer density, pressure)
inside the cell is

$$\int_{x_l}^{x_r} q(x, t)\,dx \qquad (8.8)$$

and a change in time can only be due to fluxes across the left and/or right cell
boundaries. Thus

$$\partial_t \int_{x_l}^{x_r} q(x, t)\,dx = F_l(t) - F_r(t), \qquad (8.9)$$

where $F_l(t)$ and $F_r(t)$ are rates (e.g. in g/s) at which the quantity flows through the
left and right boundaries. If we assume advection with a constant transport velocity $a$ this
flux is given as a function of the values of $q(x, t)$ as

$$F \rightarrow f(q(x, t)) = a\,q(x, t); \qquad (8.10)$$

in other words

$$\partial_t \int_{x_l}^{x_r} q(x, t)\,dx = f(q(x_l, t)) - f(q(x_r, t)). \qquad (8.11)$$


This is called the integral form of a hyperbolic conservation law. We can now
make use of the definition of integration and antiderivatives to obtain

$$\begin{aligned}
\partial_t \int_{x_l}^{x_r} q(x, t)\,dx &= -\int_{x_l}^{x_r} \partial_x f(q(x, t))\,dx \\
\int_{x_l}^{x_r} \left[\partial_t q(x, t) + \partial_x f(q(x, t))\right] dx &= 0,
\end{aligned} \qquad (8.12)$$

which leads to the well-known partial-differential equation of linear advection

$$\partial_t q(x, t) + \partial_x f(q(x, t)) = 0. \qquad (8.13)$$


This simple and elegant derivation can also be developed for the problem of 
acoustic wave propagation, leading directly to the wave equation in first-order 
form. The interested reader is referred to Leveque (2002), Section 2.9.1. Lev- 
eque (2002) also makes the point that, despite the fact that the second-order wave 
equation is often described as the fundamental one in many books, it is in fact the 
first-order system that is the more fundamental equation. Also, efficient numeri- 
cal methods are more easily derived for first-order systems than for second-order 
systems. 
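
The integral form of the conservation law can be verified numerically for a known solution. The sketch below is an illustration under stated assumptions, not part of the original text: the advection speed, the cell boundaries, and the Gaussian profile are arbitrary choices, and the integral is evaluated with a hand-written trapezoidal rule. It compares the time derivative of the mass inside the cell with the difference of the fluxes $a\,q$ at the left and right boundaries.

```python
import numpy as np

a = 2.0                      # advection speed (assumed value)
x_l, x_r = 0.0, 1.0          # cell boundaries (assumed values)

def q(x, t):
    """Exact solution of linear advection: a Gaussian translated by a*t."""
    return np.exp(-((x - a * t - 0.5) / 0.2) ** 2)

def cell_mass(t, n=20001):
    """Trapezoidal approximation of the integral of q over [x_l, x_r]."""
    x = np.linspace(x_l, x_r, n)
    y = q(x, t)
    h = x[1] - x[0]
    return h * (y.sum() - 0.5 * (y[0] + y[-1]))

t, eps = 0.1, 1e-5
# Left-hand side: time derivative of the mass (centred difference in time)
lhs = (cell_mass(t + eps) - cell_mass(t - eps)) / (2 * eps)
# Right-hand side: flux in through the left minus flux out through the right
rhs = a * q(x_l, t) - a * q(x_r, t)
print(lhs, rhs)
```

Both numbers agree to within the accuracy of the numerical differentiation, without ever evaluating a space derivative of $q$.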


The upwind scheme 


At this point let us start developing a numerical approximation (i.e. discretization)
that we can solve on a computer. To do this, instead of working on the field $q(x, t)$
itself we approximate the integral of $q(x, t)$ over the cell $\mathcal{C}$ by

$$Q_i^n = \frac{1}{dx}\int_{\mathcal{C}} q(x, t^n)\,dx. \qquad (8.14)$$

This is the average value of $q(x, t)$ inside the cell (see Fig. 8.5). In order to find an
extrapolation scheme to approximate the future state of our finite-volume cells,
we integrate Eq. 8.11 over time:

$$\int_{\mathcal{C}} q(x, t^{n+1})\,dx - \int_{\mathcal{C}} q(x, t^n)\,dx = \int_{t_n}^{t_{n+1}} f(q(x_l, t))\,dt - \int_{t_n}^{t_{n+1}} f(q(x_r, t))\,dt. \qquad (8.15)$$

Rearranging terms and dividing by $dx$ recovers the average cell values, which we
use in the numerical scheme below. Note that this equation is exact!


Finally, using the following terms for the fluxes at the boundaries,

$$F_{l,r}^n = \frac{1}{dt}\int_{t_n}^{t_{n+1}} f(q(x_{l,r}, t))\,dt, \qquad (8.16)$$

we obtain a first-order time-discrete scheme for the average values of our solution
field $q(x, t)$ (see Eq. 8.14):

$$Q_i^{n+1} = Q_i^n - \frac{dt}{dx}\left(F_r^n - F_l^n\right), \qquad (8.17)$$

where the upper index $n$ denotes time level $t_n = n\,dt$ and the lower index $i$ denotes
cell $i$ of size $dx$. It is worth noting that we reformulated our problem (defined as
a space-time partial differential equation) without making use of space derivatives.
The holy grail of finite-volume methods is an accurate representation of the
flux terms in Eq. 8.16.

The simplest, and in practice most often-used, numerical flux is developed 
using the physics of the problem itself. We know that for hyperbolic problems the 
mass (tracer, energy, information) propagates along so-called characteristics. In 
seismological terms this is related to the question of how far a point of constant 
phase propagates in a time interval $dt$. This is illustrated in Fig. 8.6, here defining
$x_i$ to be the cell boundary coordinates and $dx$ the constant cell size.

We thus seek to approximate the next cell update $Q_i^{n+1}$, knowing that

$$Q_i^{n+1} \approx q(x_i, t^{n+1}) = q(x_i - a\,dt, t^n). \qquad (8.18)$$

Information can only come from the cell to the left, $Q_{i-1}^n$, and $Q_i^{n+1}$ obviously will
only change if the adjacent cells have different averages. Thus, we can predict
the new cell average $Q_i^{n+1}$ analytically by adding the appropriate mass flowing via
the left boundary by interpolation. This comes down to simply interpolating
linearly between $Q_{i-1}^n$ and $Q_i^n$ at the point $x_i - a\,dt$ from which the information
arrives during time $dt$. We know velocity $a$ and according to Fig. 8.6 the
interpolation weight is $(dx - a\,dt)/dx$.¹ We thus obtain

$$\begin{aligned}
Q_i^{n+1} &= Q_{i-1}^n + \frac{dx - a\,dt}{dx}\left(Q_i^n - Q_{i-1}^n\right) \\
Q_i^{n+1} &= Q_i^n\left(1 - \frac{a\,dt}{dx}\right) + Q_{i-1}^n\,\frac{a\,dt}{dx}.
\end{aligned} \qquad (8.19)$$

After re-arranging we finally obtain a fully discrete scheme

$$Q_i^{n+1} = Q_i^n - \frac{a\,dt}{dx}\left(Q_i^n - Q_{i-1}^n\right) \qquad (8.20)$$

using only values from the direction where the ‘wind’ is coming from (here,
from the left, see Fig. 8.7). This is the classic upwind scheme used in many
advection-type problems (e.g. meteorology, ocean circulation). Referring to the
semi-discrete integral Eq. 8.17 we can denote the left and right numerical fluxes as

$$F_l^n = a\,Q_{i-1}^n, \qquad F_r^n = a\,Q_i^n. \qquad (8.21)$$

¹ Another way of writing this is $1 - a\,dt/dx$, and this should ring a bell! If $dt$ is too
large we would advect beyond the next cell boundary, invalidating our assumptions. Of
course this corresponds to a stability criterion (see discussion below).

Fig. 8.5 The finite-volume method: The cell averages (solid lines) are balanced
by fluxes $F_{l,r}$ through the left and right boundaries. In this illustration
there is advection to the right only (positive advection speed $a$).

Fig. 8.6 The upwind method. For the linear advection problem with positive
speed $a$ we can analytically predict where the tracer (information, seismic phase)
will be located after time $dt$. The value of $q(x_i, t^{n+1})$ will be exactly the same as
$q(x_i - a\,dt, t^n)$. We can use this information to predict the new cell average
$Q_i^{n+1}$.

Fig. 8.7 Looking upwind. Sometimes it is obvious where the wind is coming
from (picture taken near South Point, Hawai’i, one of the windiest places on
Earth).


Surprise, surprise! The experienced finite-difference modeller immediately 
recognizes this result. Eq. 8.20 is nothing but a finite-difference approximation of 
the scalar advection problem using a backward definition of the finite difference. 
The finite-volume approach does often lead to finite-difference-like algorithms, 
but to state that it is nothing but a simple finite-difference scheme is not correct. 
The 1D example illustrated here does not highlight the power of this concept for 
problems with arbitrary cell shapes and very strong discontinuities! More about 
this in Section 8.6. 

The numerical scheme developed so far is further illustrated in Fig. 8.8 for 
both possible flow directions. Knowing the structure of the advection problem, 
the cell updates can be analytically calculated using only information from adja- 
cent cells. In terms of large-scale computations this is important. Finite-volume 
schemes are always explicit and local schemes where the future of physical
systems is estimated by only looking at the immediate neighbourhood. Such
schemes lend themselves to efficient parallelization. 

As demonstrated above, for the scalar advection problem the question of how 
cell averages have to be updated at both sides of a cell (element) boundary could 
be answered in a straightforward way using analytical solutions to the advection 
problem. For more complicated hyperbolic systems (like the elastic wave equa- 
tion) the solution to this problem is more difficult. The general solution to this 
is called the Riemann problem and we will make use of the related concepts when
discussing wave propagation in heterogeneous elastic media. 

Finally, a word on stability. By looking at Eq. 8.19 we realize that this equation
only makes sense if

$$\left|\frac{a\,dt}{dx}\right| \le 1, \qquad (8.22)$$

because otherwise the information (mass, tracer, energy, etc.) would propagate
beyond the next cell boundary within the time interval dt, and the cell update 
would be wrong. Note also, as was the case with the lowest-order finite-difference 
method, a special case is adt/dx = 1, where the solution is exact to numerical 
precision, which, however, has no practical significance for realistic cases. 

Without providing an analytical proof we note that the simple upwind scheme
just developed is of first-order accuracy only and very diffusive. Thus it is not
accurate enough to be of any use for actual simulation tasks. However, the
methods based on constant cell averages can be extended to higher orders. This will
be demonstrated in the next section. 
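
The role of the stability criterion can be illustrated with a few lines of code. This is a sketch under assumptions that are not part of the original text: a boxcar initial condition, a periodic grid (via `np.roll`), and arbitrary grid size and step count. It runs the upwind scheme for three values of $a\,dt/dx$: exactly 1 (pure shift by one cell per step), below 1 (stable but smeared), and above 1 (unstable).

```python
import numpy as np

def upwind_run(courant, nx=200, nt=100):
    """Advect a boxcar with the upwind scheme on a periodic grid.

    courant = a*dt/dx; grid size, step count and periodic wrap-around
    are assumptions chosen for this demonstration.
    """
    Q = np.zeros(nx)
    Q[20:40] = 1.0
    for _ in range(nt):
        Q = Q - courant * (Q - np.roll(Q, 1))
    return Q

Q_exact = upwind_run(1.0)     # a*dt/dx = 1: shift by one cell per step
Q_stable = upwind_run(0.8)    # stable, but the boxcar is smeared out
Q_unstable = upwind_run(1.2)  # violates the stability criterion

print(np.abs(Q_exact).max(), np.abs(Q_stable).max(), np.abs(Q_unstable).max())
```

For $a\,dt/dx \le 1$ each new cell average is a weighted mean of two old ones with non-negative weights, so the solution stays bounded; for larger values one weight becomes negative and short-wavelength components grow from step to step.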


The Lax-Wendroff scheme


In this section we will encounter a mathematical trick with which high-order
extrapolation schemes can be developed. This approach was first presented by Lax
and Wendroff (1960), making use of a concept called the Cauchy-Kowaleski
procedure, which replaces all the time derivatives by space derivatives using the
original partial-differential equation.

Our goal is to find solutions to $\partial_t Q + a\,\partial_x Q = 0$. We start by using the Taylor
expansion to extrapolate $Q(x, t)$ in time to obtain

$$Q(x, t^{n+1}) = Q(x, t^n) + dt\,\partial_t Q(x, t^n) + \frac{1}{2}dt^2\,\partial_t^2 Q(x, t^n) + \dots \qquad (8.23)$$

From the governing equation we are also able to state by additional differentiations

$$\begin{aligned}
\partial_t^2 Q &= -a\,\partial_x \partial_t Q \\
\partial_t \partial_x Q &= \partial_x \partial_t Q = \partial_x(-a\,\partial_x Q) \\
\partial_t^2 Q &= a^2\,\partial_x^2 Q,
\end{aligned} \qquad (8.24)$$

noting that we just derived the second-order form of the acoustic wave equation
(space-time dependencies omitted for brevity). We can now proceed and replace
the time derivatives in Eq. 8.23 with the equivalent expressions containing space
derivatives only and obtain

$$Q(x, t^{n+1}) = Q(x, t^n) - dt\,a\,\partial_x Q(x, t^n) + \frac{1}{2}dt^2 a^2\,\partial_x^2 Q(x, t^n) + \dots \qquad (8.25)$$


Using central differencing schemes for both space derivatives

$$\begin{aligned}
\partial_x Q(x, t^n) &\approx \frac{Q_{i+1}^n - Q_{i-1}^n}{2\,dx} \\
\partial_x^2 Q(x, t^n) &\approx \frac{Q_{i+1}^n - 2Q_i^n + Q_{i-1}^n}{dx^2}
\end{aligned} \qquad (8.26)$$

we finally obtain a fully discrete second-order scheme for the extrapolation of our
cell average $Q$, with the upper index denoting time and the lower index denoting
space discretization

$$Q_i^{n+1} = Q_i^n - \frac{a\,dt}{2\,dx}\left(Q_{i+1}^n - Q_{i-1}^n\right) + \frac{1}{2}\left(\frac{a\,dt}{dx}\right)^2\left(Q_{i+1}^n - 2Q_i^n + Q_{i-1}^n\right), \qquad (8.27)$$

known as the Lax-Wendroff scheme.

Fig. 8.8 The finite-volume method, numerical fluxes. a: Average cell values $Q_i^n$
at time step $n$. b: Information propagates along so-called characteristic curves
a distance $a\,dt$ into adjacent cells. c: Differences in cell averages have propagated
from left boundaries into adjacent cells and new cell averages can be updated
analytically.

Table 8.1 Simulation parameters for scalar 1D advection with the finite-volume
method.

Parameter          Value
x_max              75,000 m
nx                 6,000
c                  2,500 m/s
dt                 0.0025 s
dx                 12.5 m
ε = c dt/dx        0.9
σ (Gauss)          200 m
x_0                1,000 m

Here we derived this scheme using standard finite-difference considerations
without making use of flux concepts. However, this result can also be obtained by
extending the finite-volume method towards higher order by approximating the
solution inside the finite volume as a piecewise linear function whose slope is
determined by interpolation; see Leveque (2002) for details. The choice of slope
considered (upwind, downwind, centred) then determines the specific second-order
numerical scheme that evolves. The Lax-Wendroff scheme corresponds to
the use of the downwind slope.

The Lax-Wendroff scheme can also be interpreted as a finite-volume method
by considering the flux functions

$$\begin{aligned}
F_l^n &= \frac{1}{2}a\left(Q_{i-1}^n + Q_i^n\right) - \frac{1}{2}\frac{dt}{dx}a^2\left(Q_i^n - Q_{i-1}^n\right) \\
F_r^n &= \frac{1}{2}a\left(Q_i^n + Q_{i+1}^n\right) - \frac{1}{2}\frac{dt}{dx}a^2\left(Q_{i+1}^n - Q_i^n\right)
\end{aligned} \qquad (8.28)$$

that enter Eq. 8.17. For more details on high-order finite-volume schemes the reader is
referred to Leveque (2002). We are now ready to implement our first numerical
finite-volume scheme for the scalar advection problem. 
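
Before implementing the schemes, the two views of the Lax-Wendroff method can be compared directly. The following is a minimal sketch with assumed function names, an assumed random test field, and periodic wrap-around via `np.roll`: it checks that inserting the Lax-Wendroff flux functions into the flux update $Q_i^{n+1} = Q_i^n - \frac{dt}{dx}(F_r^n - F_l^n)$ reproduces the finite-difference form of the scheme.

```python
import numpy as np

def lax_wendroff_direct(Q, a, dt, dx):
    """Lax-Wendroff update in finite-difference form."""
    nu = a * dt / dx
    Qp, Qm = np.roll(Q, -1), np.roll(Q, 1)   # Q_{i+1}, Q_{i-1} (periodic)
    return Q - 0.5 * nu * (Qp - Qm) + 0.5 * nu**2 * (Qp - 2 * Q + Qm)

def lax_wendroff_flux(Q, a, dt, dx):
    """The same update assembled from the left and right flux functions."""
    Qp, Qm = np.roll(Q, -1), np.roll(Q, 1)
    F_l = 0.5 * a * (Qm + Q) - 0.5 * dt / dx * a**2 * (Q - Qm)
    F_r = 0.5 * a * (Q + Qp) - 0.5 * dt / dx * a**2 * (Qp - Q)
    return Q - dt / dx * (F_r - F_l)

# Hypothetical test set-up
rng = np.random.default_rng(1)
Q0 = rng.random(64)
a, dx, dt = 2500.0, 12.5, 0.004

Q_direct = lax_wendroff_direct(Q0, a, dt, dx)
Q_fluxform = lax_wendroff_flux(Q0, a, dt, dx)
forms_agree = np.allclose(Q_direct, Q_fluxform)
print(forms_agree)
```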


8.4 Scalar advection in 1D 


We proceed with implementing the two numerical schemes: (1) the upwind
method and (2) the Lax-Wendroff scheme. Recalling their formulations,

$$Q_i^{n+1} = Q_i^n - \frac{a\,dt}{dx}\left(Q_i^n - Q_{i-1}^n\right) \qquad (8.29)$$

and

$$Q_i^{n+1} = Q_i^n - \frac{a\,dt}{2\,dx}\left(Q_{i+1}^n - Q_{i-1}^n\right) + \frac{1}{2}\left(\frac{a\,dt}{dx}\right)^2\left(Q_{i+1}^n - 2Q_i^n + Q_{i-1}^n\right), \qquad (8.30)$$

it is straightforward to initialize the discrete solution field $Q$ and make an appropriate
choice for the remaining parameters. An example is given in Table 8.1. To
keep the problem simple we use a spatial initial condition, a Gauss function with
half-width $\sigma$,

$$Q(x, t = 0) = e^{-\frac{1}{\sigma^2}(x - x_0)^2}, \qquad (8.31)$$

which is advected with speed $c = 2{,}500$ m/s. The analytical solution to this problem
is a simple translation of the initial condition to $x = x_0 + ct$, where $t = j\,dt$ is
the simulation time at time step $j$.


    # [...]
    # Time extrapolation
    for j in range(nt):
        # Upwind scheme
        if method == 'upwind':
            for i in range(1, nx-1):
                # One-sided difference looking upwind (c > 0)
                dQ[i] = (Q[i] - Q[i-1]) / dx
            # Time extrapolation
            Q = Q - dt * c * dQ
        # Lax-Wendroff scheme
        if method == 'Lax-Wendroff':
            for i in range(1, nx-1):
                # Second difference and centred first difference
                dQ1[i] = Q[i+1] - 2 * Q[i] + Q[i-1]
                dQ2[i] = Q[i+1] - Q[i-1]
            # Time extrapolation
            Q = Q - c/2 * dt/dx * dQ2 + 0.5 * (c * dt/dx)**2 * dQ1
        # Boundary condition
        # Periodic
        # Q[0] = Q[nx-2]
        # Absorbing
        Q[nx-1] = Q[nx-2]
    # [...]


The code snippet above illustrates the Python implementation of the time 
extrapolation discussed earlier omitting the specification part. The fields 
Q, dQ, dQ1, dQ2 are vectors with nx elements. Before the time extrapolation, Q is
initialized with the Gaussian function specified above. 
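
For readers who want to run the experiment, a self-contained variant of the time loop is sketched below. It is not the original code: the function name `advect`, the vectorized slicing, the shortened domain (`nx = 1200`) and the end time are assumptions made to keep the run short, and the time step is derived here from the Courant number $\epsilon = c\,dt/dx$ rather than prescribed. The numerical result is compared with the translated Gauss function.

```python
import numpy as np

def advect(method, nx=1200, dx=12.5, c=2500.0, eps=0.9,
           sigma=200.0, x0=1000.0, t_end=4.0):
    """Advect a Gauss function; returns grid, numerical and exact solution.

    All default values are illustrative assumptions; dt follows from eps.
    """
    dt = eps * dx / c
    nt = int(round(t_end / dt))
    x = np.arange(nx) * dx
    Q = np.exp(-((x - x0) / sigma) ** 2)
    for _ in range(nt):
        Qn = Q.copy()
        if method == 'upwind':
            Q[1:-1] = Qn[1:-1] - eps * (Qn[1:-1] - Qn[:-2])
        elif method == 'Lax-Wendroff':
            Q[1:-1] = (Qn[1:-1] - 0.5 * eps * (Qn[2:] - Qn[:-2])
                       + 0.5 * eps**2 * (Qn[2:] - 2 * Qn[1:-1] + Qn[:-2]))
        else:
            raise ValueError(method)
        Q[-1] = Q[-2]            # absorbing right boundary (ghost cell)
    Q_exact = np.exp(-((x - x0 - c * nt * dt) / sigma) ** 2)
    return x, Q, Q_exact

x, Q_up, Q_ex = advect('upwind')
_, Q_lw, _ = advect('Lax-Wendroff')
err_up = np.abs(Q_up - Q_ex).max()
err_lw = np.abs(Q_lw - Q_ex).max()
print('max error upwind:', err_up, ' Lax-Wendroff:', err_lw)
print('peak upwind:', Q_up.max(), ' Lax-Wendroff:', Q_lw.max())
```

With these settings the pulse travels about 10 km; the printed peak values and errors give a quick quantitative impression of the difference between the two schemes.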

Note that the spatial loop omits the boundary points; this allows us to implement
specific boundary conditions (e.g. circular, absorbing, reflecting) by using
so-called ghost cells. In this case the physical boundaries are the left and right
limits of cells 2 and $nx - 1$, respectively. In the case of positive advection velocity
$c$, we can implement periodic and absorbing boundary conditions with the
statements

$$\begin{aligned}
\text{Periodic:} \quad & Q_1^n = Q_{nx-1}^n \\
\text{Absorbing:} \quad & Q_{nx}^n = Q_{nx-1}^n
\end{aligned} \qquad (8.32)$$

and corresponding statements for negative advection speeds or propagation in both
directions (see Fig. 8.9). We illustrate the algorithm for a homogeneous medium 
with constant scalar advection velocity c. The heterogeneous case is studied for 
the elastic wave-propagation problem. 
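
The effect of the two ghost-cell statements can be made visible by tracking the total mass inside the physical domain. The sketch below rests on assumptions not found in the original text: a boxcar initial condition, a small grid, the function name `upwind_with_ghosts`, zero inflow at the left ghost cell in the absorbing case, and ghost cells that are filled before each update. Indices are zero-based as in the Python snippet above, so cells `0` and `nx-1` are the ghost cells.

```python
import numpy as np

def upwind_with_ghosts(boundary, nx=102, nt=150, courant=0.5):
    """Upwind advection (c > 0) on cells 1..nx-2 with ghost cells 0 and nx-1."""
    Q = np.zeros(nx)
    Q[60:80] = 1.0
    for _ in range(nt):
        # Fill the ghost cells before the update
        if boundary == 'periodic':
            Q[0] = Q[nx - 2]      # inflow equals content of last interior cell
        elif boundary == 'absorbing':
            Q[0] = 0.0            # nothing enters from the left (assumption)
        else:
            raise ValueError(boundary)
        Q[nx - 1] = Q[nx - 2]     # right ghost cell copies its neighbour
        Q[1:-1] = Q[1:-1] - courant * (Q[1:-1] - Q[:-2])
    return Q[1:-1]                # interior (physical) cells only

mass_periodic = upwind_with_ghosts('periodic').sum()
mass_absorbing = upwind_with_ghosts('absorbing').sum()
print(mass_periodic, mass_absorbing)
```

In the periodic case the mass that leaves through the right boundary re-enters on the left, so the total stays constant; in the absorbing case the signal simply leaves the domain.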

The results of the simulation examples are shown in Fig. 8.10. Comparison
with the analytical solution of this simple advection problem illustrates
the increasing diffusion of the original signal in the case of the first-order upwind
method.² This disqualifies the upwind scheme for any realistic problem.
The second-order Lax-Wendroff scheme does not show this strong diffusive
behaviour; the peak amplitude is basically stable and the Gaussian waveform remains
more or less unchanged. However, a slight shift of the original waveform
with reference to the analytical solution develops with increasing propagation
distance. This is also a well-known phenomenon that has to be taken into
account when deciding on a final parameter set-up for a specific simulation
problem. Fig. 8.10 (bottom row) illustrates the behaviour of the numerical solutions
for a boxcar initial condition. Again, the upwind scheme diffuses the
initial signal in a non-physical way. However, the second-order Lax-Wendroff
scheme is not capable of advecting the boxcar function accurately either. For seismic
wave-propagation tasks this is usually no problem as we propagate smooth
wavefields. For other physical problems that contain discontinuities, such as
shock waves, more effort is needed to avoid this numerical behaviour,
for example by using adaptive meshes that are refined at the
discontinuities during runtime.

² Note that this numerical diffusion can be derived analytically using Fourier
analysis similar to the approach taken to understand numerical dispersion in the
case of the finite-difference method. See Leveque (2002) for details.

Fig. 8.9 Boundary conditions. Absorbing or circular boundary conditions can
be implemented by using ghost cells outside the physical domain $x \in [x_0, x_{max}]$.

Fig. 8.10 The finite-volume method, scalar advection. Simulation examples
for the scalar advection problem, parameters given in Table 8.1. Top: Snapshots
of an advected Gauss function (analytical solution, dotted line) are compared
with the numerical solution of the first-order upwind method (solid line) and
the second-order Lax-Wendroff scheme (dashed line) for increasing propagation
distances. Bottom: The same for a boxcar function. In both cases the size of
the window is 1.2 km.

Concluding this introduction of the simplest finite-volume scheme we note 
that we developed a numerical scheme from first principles, advecting cell av- 
erages by analytically predicting the flow across cell boundaries (adopting the 
analytical solution to the advection problem). What remains to be shown is how 
this relates to the problem of elastic wave propagation. This will be the topic of 
the following sections. 
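
The different behaviour of the two schemes for a discontinuous signal can be quantified with a short experiment. This is a sketch under assumptions of our own (boxcar width, grid size, number of steps, Courant number `nu`, periodic wrap-around via `np.roll`), not the set-up used for the figure: it advects a boxcar with both schemes and prints the minimum and maximum of the result. Values outside the initial range $[0, 1]$ indicate non-physical oscillations.

```python
import numpy as np

def advect_boxcar(method, nu=0.5, nx=400, nt=100):
    """Advect a boxcar on a periodic grid; nu = c*dt/dx (assumed values)."""
    Q = np.zeros(nx)
    Q[100:160] = 1.0
    for _ in range(nt):
        Qm, Qp = np.roll(Q, 1), np.roll(Q, -1)    # Q_{i-1}, Q_{i+1}
        if method == 'upwind':
            Q = Q - nu * (Q - Qm)
        elif method == 'Lax-Wendroff':
            Q = Q - 0.5 * nu * (Qp - Qm) + 0.5 * nu**2 * (Qp - 2 * Q + Qm)
        else:
            raise ValueError(method)
    return Q

Q_up = advect_boxcar('upwind')
Q_lw = advect_boxcar('Lax-Wendroff')
print('upwind       min/max:', Q_up.min(), Q_up.max())
print('Lax-Wendroff min/max:', Q_lw.min(), Q_lw.max())
```

The upwind result stays within the initial bounds but loses its sharp edges, whereas the Lax-Wendroff result overshoots and undershoots near the discontinuities.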


8.5 Elastic waves in 1D 


Let us look at the source-free version of the coupled first-order elastic wave equation
in 1D that we encountered when we discussed staggered finite-difference
schemes. Denoting $v = v(x, t)$ as the transverse velocity and $\sigma = \sigma_{xy}(x, t)$ as the
shear stress component, we obtain

$$\begin{aligned}
\partial_t \sigma - \mu\,\partial_x v &= 0 \\
\partial_t v - \frac{1}{\rho}\,\partial_x \sigma &= 0,
\end{aligned} \qquad (8.33)$$

where $\rho$ is density and $\mu$ is the shear modulus. They can in general be
space-dependent, but for the moment we assume they are constant. Here, we encounter
for the first time a formulation of this equation that is fundamental for the
treatment of coupled hyperbolic equations as linear systems. This applies in particular
to the finite-volume method and the discontinuous Galerkin method discussed in
Chapter 9.

We proceed by writing this equation in matrix-vector notation

$$\partial_t Q + A\,\partial_x Q = 0, \qquad (8.34)$$

where $Q = (\sigma, v)^T$ is the vector of unknowns and matrix $A$ contains the
parameters

$$A = \begin{pmatrix} 0 & -\mu \\ -1/\rho & 0 \end{pmatrix}. \qquad (8.35)$$

The above matrix equation is formally analogous to the simple advection equation
$\partial_t q + a\,\partial_x q = 0$, which is descriptive of many physical phenomena. But it is coupled.
To be able to apply the previously developed numerical tools to this equation we
have to find a way to decouple it. This is the main purpose of this section. It will
also allow us to come up with an analytical solution.

What needs to be done is to demonstrate the hyperbolicity of the wave equation 
in this form; that is, to show that A is diagonalizable. If we succeed, it will suf- 
fice to discuss the scalar form of this equation that will then be extendible to the 
vector—matrix form in a straightforward way. Mathematically, it means we have to 
diagonalize the matrix A, which will allow us to decouple the two equations. 

In the case of a square matrix $A$ of shape $m \times m$ ($m = 2$ in our case), this
obviously leads to an eigenvalue problem. If we are able to obtain eigenvalues $\lambda_p$
such that

$$ A x_p = \lambda_p x_p, \qquad p = 1, \ldots, m, \tag{8.36} $$

we get a diagonal matrix of eigenvalues

$$ \Lambda = \begin{pmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_m \end{pmatrix} \tag{8.37} $$

and the corresponding matrix $R$ containing the eigenvectors $x_p$ in each column:

$$ R = (x_1 \,|\, x_2 \,|\, \ldots \,|\, x_m). \tag{8.38} $$

Note 3: Watch out, the standard procedure leads to a truism. One of the
components is set to 1, which constrains the other component.


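The matrix $A$ can now be expressed with the definitions

$$
\begin{aligned}
A &= R \Lambda R^{-1} \\
\Lambda &= R^{-1} A R.
\end{aligned}
\tag{8.39}
$$

Applying these definitions to Eq. 8.34 (multiplying from the left by $R^{-1}$) we obtain

$$ R^{-1} \partial_t Q + R^{-1} R \Lambda R^{-1} \partial_x Q = 0, \tag{8.40} $$

and introducing the solution vector $W = R^{-1} Q$ results in

$$ \partial_t W + \Lambda \, \partial_x W = 0. \tag{8.41} $$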


Bingo! As $\Lambda$ is diagonal, we now have, in the eigensystem, two entirely
decoupled scalar advection equations.

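What remains to be shown is that in our specific case $A$ has real eigenvalues.
These are easily determined as $\lambda_{1,2} = \mp\sqrt{\mu/\rho} = \mp c$, corresponding to the shear
velocity $c$. For the eigenvectors (see note 3) we obtain

$$ r_1 = \begin{pmatrix} Z \\ 1 \end{pmatrix}, \qquad r_2 = \begin{pmatrix} -Z \\ 1 \end{pmatrix}, \tag{8.42} $$

which, interestingly enough, contain as first elements values of the seismic
impedance $Z = \rho c$ relevant for the reflection behaviour of seismic waves. Thus,
the matrix $R$ and its inverse are

$$ R = \begin{pmatrix} Z & -Z \\ 1 & 1 \end{pmatrix}, \qquad R^{-1} = \frac{1}{2Z} \begin{pmatrix} 1 & Z \\ -1 & Z \end{pmatrix}. \tag{8.43} $$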
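
The diagonalization can be checked numerically in a few lines. The following is a minimal sketch, assuming NumPy and the material values of Table 8.2 (density and velocity both 2,500 in SI units); the variable names are chosen here for illustration only.

```python
import numpy as np

# Assumed material values (those of Table 8.2): density and shear velocity
rho, c = 2500.0, 2500.0
mu = rho * c**2          # shear modulus, from c = sqrt(mu / rho)
Z = rho * c              # seismic impedance

A = np.array([[0.0, -mu], [-1.0 / rho, 0.0]])          # coefficient matrix, Eq. 8.35
R = np.array([[Z, -Z], [1.0, 1.0]])                    # eigenvectors as columns
R_inv = np.array([[1.0, Z], [-1.0, Z]]) / (2.0 * Z)    # inverse of R

Lam = R_inv @ A @ R                            # expected: diag(-c, +c)
eigvals = np.sort(np.linalg.eigvals(A).real)   # independent check of the eigenvalues

print(np.round(Lam, 6))
print(eigvals)
```

The off-diagonal elements of `Lam` vanish to rounding error, and the eigenvalues returned by `np.linalg.eigvals` agree with the shear velocity of either sign.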


While all this seems rather theoretical, it has important practical consequences. 
The decoupling of the matrix-vector equation initially stated implies that we can
proceed by employing the same numerical solution methods developed for the 
scalar linear advection equation. The eigenvalues take the meaning of the trans- 
port velocity. It is instructive to discuss the analytical solution to the homogeneous 
equation (for the initial value problem) as it can later be used as a benchmark. 


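The wave equation in the rotated eigensystem can be stated as

$$ \partial_t \begin{pmatrix} w_1 \\ w_2 \end{pmatrix} + \begin{pmatrix} -c & 0 \\ 0 & c \end{pmatrix} \partial_x \begin{pmatrix} w_1 \\ w_2 \end{pmatrix} = 0, \tag{8.44} $$

with the simple general solution $w_{1,2} = w_{1,2}^{(0)}(x \pm ct)$, where the upper index 0 stands
for the initial condition (i.e. waveform that is advected). The initial condition
also fulfills $W^{(0)} = R^{-1} Q^{(0)}$. We can therefore relate the so-called characteristic
variables $w_{1,2}$ to the initial conditions of the physical variables as

$$
\begin{aligned}
w_1(x,t) &= \frac{1}{2Z} \left( \sigma^{(0)}(x+ct) + Z v^{(0)}(x+ct) \right) \\
w_2(x,t) &= \frac{1}{2Z} \left( -\sigma^{(0)}(x-ct) + Z v^{(0)}(x-ct) \right)
\end{aligned}
\tag{8.45}
$$

to obtain the final analytical solution for velocity $v$ and stress $\sigma$ using $Q = RW$ as

$$
\begin{aligned}
\sigma(x,t) &= \frac{1}{2} \left( \sigma^{(0)}(x+ct) + \sigma^{(0)}(x-ct) \right)
 + \frac{Z}{2} \left( v^{(0)}(x+ct) - v^{(0)}(x-ct) \right) \\
v(x,t) &= \frac{1}{2Z} \left( \sigma^{(0)}(x+ct) - \sigma^{(0)}(x-ct) \right)
 + \frac{1}{2} \left( v^{(0)}(x+ct) + v^{(0)}(x-ct) \right).
\end{aligned}
\tag{8.46}
$$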


In physical terms one can see that any initial condition in either stress $\sigma$ or
velocity $v$ is coupled with the other variable and advected in both $\pm x$ directions
with velocity $c$. In compact form, with the above definitions, this solution can be
expressed as

$$ Q(x,t) = \sum_{p=1}^{m} w_p(x,t) \, r_p, \tag{8.47} $$

meaning that any solution is a sum over weighted eigenvectors, a superposition
of $m$ waves, each advected without change in shape. The $p$th wave has the shape
$w_p r_p$ and propagates with a velocity corresponding to eigenvalue $\lambda_p$.
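
Since the analytical solution is meant to serve as a benchmark, it may help to see it as code. The sketch below evaluates the stress and velocity fields for arbitrary initial conditions passed as functions; the function name `analytical_solution` and the Gaussian example values are assumptions made for illustration.

```python
import numpy as np

def analytical_solution(x, t, sigma0, v0, rho, c):
    """Stress and velocity at time t for initial conditions sigma0(x), v0(x).

    sigma0 and v0 are callables; rho and c are constant density and velocity.
    """
    Z = rho * c
    sig_p, sig_m = sigma0(x + c * t), sigma0(x - c * t)
    v_p, v_m = v0(x + c * t), v0(x - c * t)
    sigma = 0.5 * (sig_p + sig_m) + 0.5 * Z * (v_p - v_m)
    v = (sig_p - sig_m) / (2.0 * Z) + 0.5 * (v_p + v_m)
    return sigma, v

# Hypothetical example: Gaussian stress pulse, zero initial velocity
rho, c = 2500.0, 2500.0
x = np.linspace(0.0, 10000.0, 801)
gauss = lambda s: np.exp(-((s - 5000.0) / 200.0) ** 2)
zero = lambda s: np.zeros_like(s)
sigma, v = analytical_solution(x, 1.0, gauss, zero, rho, c)
```

With zero initial velocity the stress pulse splits into two pulses of half the amplitude. Choosing the initial velocity as minus the initial stress divided by the impedance instead yields a single pulse travelling in the positive x direction.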

Given the hyperbolicity of our linear system of equations descriptive of elastic 
wave propagation, we can directly apply the numerical schemes developed for the 
scalar advection case with only slight modifications. To illustrate some fundamen- 
tal approaches to the flux calculations, we start with the case of homogeneous 
material. 


8.5.1 Homogeneous case 


To develop the numerical scheme for elastic wave propagation, we encounter
a fundamental problem, briefly mentioned above: the Riemann problem (see note 4).
This problem consists of a single discontinuity as initial condition to an advection
problem (or any hyperbolic partial differential equation). This problem is illustrated
in Fig. 8.11. Why do we need to deal with this? We do not assume continuity of
our solution field at the cell boundaries (as was necessary in the finite- or
spectral-element method). Thus we have to calculate at each time step how much of the
discontinuous field is flowing through the boundary in both directions. Riemann
provided a general solution to this problem.

Note 4: Bernhard Riemann (1826-1866) is mostly known for his ground-breaking
work in analysis and differential geometry. He died young, of tuberculosis.

Fig. 8.11 Riemann problem, homogeneous case. Top: A discontinuity $\Delta Q$
is located at $x = 0$ as initial condition to the advection equation (e.g. as initial
stress discontinuity). Bottom: The discontinuity propagates along characteristic
curves in the space-time domain. The figure illustrates adjacent cells and
two time levels $t_n$ and $t_{n+1}$. Two waves propagate in opposite directions,
modifying the values in the cells adjacent to $x = 0$.


The Riemann problem 


To make the link to our problem of elastic wave propagation note that, according
to Eq. 8.47, the solution to our problem is a superposition of weighted eigenvectors
$r_p$, in our case $p = 1, 2$. We do not know how they are partitioned. Therefore,
we can decompose the discontinuity jump into these eigenvectors with fractions
$\alpha_1$ and $\alpha_2$, which we can determine according to

$$
\begin{aligned}
\Delta Q = Q_r - Q_l &= \alpha_1 r_1 + \alpha_2 r_2 \\
R \alpha &= \Delta Q \\
\alpha &= R^{-1} \Delta Q,
\end{aligned}
\tag{8.48}
$$

where $R$ is the matrix of eigenvectors as defined above. As expected, for a
stress discontinuity $\Delta Q = [1, 0]^T$ we obtain two waves propagating in
opposite directions that carry equal parts of the jump: the weights are
$\alpha = [1, -1]/(2Z)$, so that both $\alpha_1 r_1$ and $\alpha_2 r_2$ have the stress
component 0.5 (see exercises). The central concept of the flux calculations for
finite-volume and discontinuous Galerkin methods is simply realizing that such a
discontinuity propagates a distance $c\,dt$ into adjacent cells, therefore changing
the cell value by an amount $(c\,dt/dx)\,\Delta Q$. This is the
approach taken in the scalar case. What makes the problem for linear systems
more complicated is the fact that we need, first, to decompose the problem into
its eigenvectors.
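
The decomposition of a jump into eigenvectors amounts to solving a 2 x 2 linear system. A minimal sketch follows, assuming NumPy; the function name `riemann_weights` and the impedance value used in the example are hypothetical.

```python
import numpy as np

def riemann_weights(dQ, Z):
    """Decompose a jump dQ = (stress, velocity) into the two eigenvectors.

    Returns the weights alpha and a 2 x 2 array whose column p is alpha_p * r_p.
    """
    R = np.array([[Z, -Z], [1.0, 1.0]])                      # eigenvectors as columns
    alpha = np.linalg.solve(R, np.asarray(dQ, dtype=float))  # solves R alpha = dQ
    waves = R * alpha                                        # scales column p by alpha_p
    return alpha, waves

# Pure stress discontinuity; Z = 2 is an arbitrary test value
alpha, waves = riemann_weights([1.0, 0.0], Z=2.0)
print(alpha)
print(waves)
```

Each column of `waves` carries half of the stress jump, with velocity components of opposite sign that cancel in the sum.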

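An elegant way of describing this for the elastic problem is to decompose
the solution into positive (right-propagating) and negative (left-propagating)
eigenvalues:

$$ \Lambda^- = \begin{pmatrix} -c & 0 \\ 0 & 0 \end{pmatrix}, \qquad \Lambda^+ = \begin{pmatrix} 0 & 0 \\ 0 & c \end{pmatrix}. \tag{8.49} $$

Then we can derive matrices $A^\pm$, corresponding to the advection velocity in the
scalar case

$$
\begin{aligned}
A^+ &= R \Lambda^+ R^{-1} \\
A^- &= R \Lambda^- R^{-1},
\end{aligned}
\tag{8.50}
$$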
allowing us to calculate analytically the fluxes in a finite-volume cell. This is
graphically illustrated in Fig. 8.12 for one of the vector components of $Q$. For
cell $i$ we have to propagate the solution fields $Q$ from adjacent cells in both
directions from the corresponding left and right boundaries. Taking the view of cell $i$
we have:

$$
\begin{aligned}
+A^+ Q_{i-1} &\rightarrow \text{left incoming} \\
-A^+ Q_i &\rightarrow \text{right outgoing} \\
+A^- Q_{i+1} &\rightarrow \text{right incoming} \\
-A^- Q_i &\rightarrow \text{left outgoing}.
\end{aligned}
\tag{8.51}
$$

Adding these contributions we obtain

$$ A^+ (Q_{i-1} - Q_i) + A^- (Q_{i+1} - Q_i). \tag{8.52} $$


Defining the cell differences as

$$
\begin{aligned}
\Delta Q_l &= Q_i - Q_{i-1} \\
\Delta Q_r &= Q_{i+1} - Q_i
\end{aligned}
\tag{8.53}
$$

we can formulate an upwind finite-volume scheme for any linear hyperbolic
system as

$$ Q_i^{n+1} = Q_i^n - \frac{dt}{dx} \left( A^+ \Delta Q_l + A^- \Delta Q_r \right). \tag{8.54} $$

Please note that we recover the signs of the flux contributions to cell $i$ in Eq. 8.51
by recognizing that the terms with $A^-$ are negative (in the scalar case corresponding
to the term $-c$). We can relate this formulation to the basic flux concept of
Eq. 8.21 with the definitions

$$
\begin{aligned}
F_l &= A^+ \Delta Q_l \\
F_r &= A^- \Delta Q_r.
\end{aligned}
\tag{8.55}
$$
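
The upwind scheme for the system can be sketched in vectorized form as follows. This is an illustration under simple assumptions (constant material, boundary cells left untouched); the function names `split_matrices` and `upwind_step` and the test values are hypothetical.

```python
import numpy as np

def split_matrices(rho, c):
    """Return A+ and A- for constant density rho and velocity c."""
    Z = rho * c
    R = np.array([[Z, -Z], [1.0, 1.0]])
    R_inv = np.array([[1.0, Z], [-1.0, Z]]) / (2.0 * Z)
    A_plus = R @ np.diag([0.0, c]) @ R_inv     # right-propagating part
    A_minus = R @ np.diag([-c, 0.0]) @ R_inv   # left-propagating part
    return A_plus, A_minus

def upwind_step(Q, A_plus, A_minus, dt, dx):
    """One upwind step for Q of shape (2, nx); boundary cells stay unchanged."""
    Qnew = Q.copy()
    dQl = Q[:, 1:-1] - Q[:, :-2]      # Q_i - Q_{i-1}
    dQr = Q[:, 2:] - Q[:, 1:-1]       # Q_{i+1} - Q_i
    Qnew[:, 1:-1] = Q[:, 1:-1] - dt / dx * (A_plus @ dQl + A_minus @ dQr)
    return Qnew

# Hypothetical test set-up: a purely right-propagating pulse, Q = g(x) * (-Z, 1)
rho, c, dx = 2.0, 3.0, 1.0
dt = dx / c                            # Courant number 1
x = np.arange(40) * dx
g = np.exp(-((x - 10.0) / 2.0) ** 2)
Q = np.vstack((-rho * c * g, g))
A_plus, A_minus = split_matrices(rho, c)
Qnew = upwind_step(Q, A_plus, A_minus, dt, dx)
```

With a Courant number of exactly 1 the pulse is shifted by one cell per step without change in shape. For smaller time steps the same function exhibits the strong numerical diffusion of the first-order scheme.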
As was already discussed in connection with the scalar advection problem, the
first-order upwind solution is of no practical use because of its strong diffusive
behaviour. Therefore we present the second-order Lax-Wendroff scheme analogous
to the scalar advection case. Interestingly, the high-order scheme does not
necessitate the separation into eigenvectors and the matrix $A$ can be used in its
original form. The extrapolation scheme reads

$$
Q_i^{n+1} = Q_i^n - \frac{dt}{2\,dx} A \left( Q_{i+1}^n - Q_{i-1}^n \right)
 + \frac{1}{2} \left( \frac{dt}{dx} \right)^2 A^2 \left( Q_{i-1}^n - 2 Q_i^n + Q_{i+1}^n \right).
\tag{8.56}
$$


Let us illustrate the solution using the Lax-Wendroff scheme. The parameters for
the simulation are given in Table 8.2. The 1D medium is divided into volume cells 
of equal size. The time-extrapolation scheme can be implemented as presented in 
the following code snippet: 
Fig. 8.12 Finite-volume elastic wave propagation, homogeneous case. The
constant average cell values of $Q$ are illustrated for three adjacent cells
$i-1$, $i$, $i+1$. The eigenvector decomposition leads to wave $A^+$ propagating from
the left boundary with velocity $c$ into cell $i$ and wave $A^-$ propagating with
velocity $-c$ into cell $i$ from the right boundary. This determines the flux of
discontinuities $\Delta Q_{l,r}$ into cell $i$ by the amount $c\,dt/dx\,\Delta Q_{l,r}$.

Table 8.2 Simulation parameters for 1D elastic wave propagation. Homogeneous case.

Parameter      Value
x_max          10,000 m
nx             800
c              2,500 m/s
ρ              2,500 kg/m³
dt             0.0025 s
dx             12.5 m
ε              0.5
σ (Gauss)      200 m
x_0            5,000 m
# [...]
import numpy as np
# Specifications
Q = np.zeros((2, nx))
Qnew = np.zeros((2, nx))
# Initial condition (Gaussian in stress, zero velocity)
Qnew[0, :] = np.exp(-1 / sig**2 * (x - x0)**2)
# Time extrapolation
for j in range(nt):
    # Lax-Wendroff scheme
    Q = Qnew.copy()   # copy, so that Q is not overwritten during the update
    for i in range(1, nx-1):
        dQ1 = Q[:, i+1] - Q[:, i-1]
        dQ2 = Q[:, i-1] - 2*Q[:, i] + Q[:, i+1]
        Qnew[:, i] = Q[:, i] - dt/(2*dx) * A @ dQ1 + \
            0.5 * (dt/dx)**2 * (A @ A) @ dQ2
    # Absorbing boundary conditions
    Qnew[:, 0] = Qnew[:, 1]
    Qnew[:, nx-1] = Qnew[:, nx-2]
# [...]


Note here that the solution vector Q has the shape (2, nx), containing stress
values in Q[0,:] and velocity values in Q[1,:] for each volume cell of constant size
dx. Note that the size of the grid cells can easily be modified provided that the
Courant criterion is satisfied through the global time step.

Results of the simulations are shown in Fig. 8.13 for both solution fields stress 
(top) and velocity (bottom). The initial condition is a Gaussian-shaped function 
in stress that is advected in both positive and negative directions with velocity c, 
corresponding to the modulus of the eigenvalues of matrix A. The numerical so- 
lution is shown along with the analytical solution. With this simulation set-up they 
are indistinguishable. An extensive investigation of the accuracy of this approach 
is left for the computational exercises. 
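
As a starting point for such an investigation, the following self-contained sketch runs a vectorized Lax-Wendroff simulation and measures the misfit to the analytical stress solution for zero initial velocity. The function name `run_lax_wendroff`, the cell-centred grid, the number of time steps, and the time step derived from a Courant number of 0.5 are assumptions made here, not necessarily the set-up used for the figure.

```python
import numpy as np

def run_lax_wendroff(nx=800, xmax=10000.0, c=2500.0, rho=2500.0,
                     eps=0.5, sig=200.0, x0=5000.0, nt=400):
    dx = xmax / nx
    dt = eps * dx / c                      # time step from Courant number eps
    x = (np.arange(nx) + 0.5) * dx         # cell centres (an assumed layout)
    A = np.array([[0.0, -rho * c**2], [-1.0 / rho, 0.0]])
    AA = A @ A
    gauss = lambda s: np.exp(-((s - x0) / sig) ** 2)
    Q = np.zeros((2, nx))
    Q[0] = gauss(x)                        # initial condition in stress only
    for _ in range(nt):
        Qo = Q.copy()
        d1 = Qo[:, 2:] - Qo[:, :-2]
        d2 = Qo[:, :-2] - 2.0 * Qo[:, 1:-1] + Qo[:, 2:]
        Q[:, 1:-1] = (Qo[:, 1:-1] - dt / (2.0 * dx) * (A @ d1)
                      + 0.5 * (dt / dx) ** 2 * (AA @ d2))
        Q[:, 0], Q[:, -1] = Q[:, 1], Q[:, -2]   # simple absorbing boundaries
    t = nt * dt
    # Analytical stress for zero initial velocity: two pulses of half amplitude
    sigma_exact = 0.5 * (gauss(x + c * t) + gauss(x - c * t))
    return x, Q, sigma_exact

x, Q, sigma_exact = run_lax_wendroff()
max_err = np.max(np.abs(Q[0] - sigma_exact))
print(f"maximum stress misfit: {max_err:.2e}")
```

Varying `eps`, `nx`, or `sig` in this sketch gives a first impression of how the misfit depends on the Courant number and on the number of grid cells per wavelength.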


8.5.2 Heterogeneous case 


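Fortunately, the extension to the heterogeneous case is straightforward. The
coefficient matrix $A$ is allowed to vary for each element $i$ as

$$ A_i = \begin{pmatrix} 0 & -\mu_i \\ -1/\rho_i & 0 \end{pmatrix}. \tag{8.57} $$

We use again the concept of separating the left- and right-propagating wavefields,
defining for each element matrices

$$ \Lambda_i^- = \begin{pmatrix} -c_i & 0 \\ 0 & 0 \end{pmatrix}, \qquad \Lambda_i^+ = \begin{pmatrix} 0 & 0 \\ 0 & c_i \end{pmatrix} \tag{8.58} $$

and using the definitions

$$ R_i = \begin{pmatrix} Z_i & -Z_i \\ 1 & 1 \end{pmatrix} \tag{8.59} $$

for the matrix with eigenvectors describing the solutions inside element $i$ and
$Z_i = \rho_i c_i$. We can determine the corresponding advection terms as

$$
\begin{aligned}
A_i^+ &= R_i \Lambda_i^+ R_i^{-1} \\
A_i^- &= R_i \Lambda_i^- R_i^{-1}.
\end{aligned}
\tag{8.60}
$$

We can now write down the Euler scheme for the elastic wave equation in the
heterogeneous case to obtain

$$
\begin{aligned}
\Delta Q_l &= Q_i - Q_{i-1} \\
\Delta Q_r &= Q_{i+1} - Q_i \\
Q_i^{n+1} &= Q_i^n - \frac{dt}{dx} \left( A_i^+ \Delta Q_l + A_i^- \Delta Q_r \right).
\end{aligned}
\tag{8.61}
$$

[Fig. 8.13 plot: stress (Pa, top) and velocity (m/s, scaled; bottom) versus x (m) from 0 to 10,000 m, at time t = 1.47685 s.]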
Fig. 8.13 Finite-volume solution for elastic wave propagation: homogeneous
case. The stress-velocity system is advected for an initial condition of Gaussian
shape (top, dashed line, scaled by factor 1/2). Top: Stress snapshot at time
t = 1.5 s. Bottom: Velocity snapshot at the same time. In both cases analytical
solutions are superimposed.




Note 5: The flux scheme used for the homogeneous and heterogeneous cases is called
the Godunov upwind flux. Exactly the same flux scheme is applied to the
discontinuous Galerkin method. It is important to note that this flux scheme requires
constant parameters inside the cells/elements. Other schemes (like so-called
fluctuations) have to be used for schemes with parameters varying inside the
cells/elements. See Leveque (2002) for details.


Fig. 8.14 The finite-volume method, elastic waves in a heterogeneous medium.
The two solution fields (stresses Q[0,:], left column; velocities Q[1,:], right
column; both solid lines) are shown for various time steps for an initial stress
condition (dashed line, left column). We see stress waves propagating away from
the source point and antisymmetric velocity components. At the interface the
right-propagating wave speeds up, and a reflection propagates back into the
left material. The staggered-grid finite-difference solution (without absorbing
boundary) is superimposed (dotted line).


This numerical solution is again too diffusive to be of any use in practice. However,
it contains the flux scheme that is also used in the high-order discontinuous
Galerkin method introduced in the next chapter.

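As was the case in homogeneous media, a second-order scheme can be
obtained with a Lax-Wendroff scheme that does not require separating the wavefields
(see Leveque, 2002, for a full derivation). Using the above definition for $A_i$
the Lax-Wendroff finite-volume scheme can be written as

$$
\begin{aligned}
\Delta Q_l &= Q_i - Q_{i-1} \\
\Delta Q_r &= Q_{i+1} - Q_i \\
Q_i^{n+1} &= Q_i^n - \frac{dt}{2\,dx} A_i \left( \Delta Q_l + \Delta Q_r \right)
 + \frac{1}{2} \left( \frac{dt}{dx} \right)^2 A_i^2 \left( \Delta Q_r - \Delta Q_l \right).
\end{aligned}
\tag{8.62}
$$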


This scheme is implemented in the following Python code fragment with the
coefficient matrices stored in an array A of shape (2, 2, nx), one 2 x 2 matrix for
each of the nx finite-volume cells. Note here that changing the cell size is trivial
and would only make the dx in the above equation space-dependent (see computer
exercises).


[Fig. 8.14 plot: panels of stress (Pa, left column) and velocity (m/s, scaled; right column) at t = 0.375 s, plotted against x (m) from 0 to 10,000 m.]


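# [...]
# Time extrapolation
for j in range(nt):
    Q = Qnew.copy()   # previous time level (copy avoids aliasing)
    for i in range(1, nx-1):
        # Differences to the left and right neighbours
        dQl = Q[:, i] - Q[:, i-1]
        dQr = Q[:, i+1] - Q[:, i]
        # Lax-Wendroff update with the local coefficient matrix A[:, :, i]
        Qnew[:, i] = Q[:, i] - dt/(2*dx) * A[:, :, i] @ (dQl + dQr) + \
            0.5 * (dt/dx)**2 * A[:, :, i] @ A[:, :, i] @ (dQr - dQl)

    # Absorbing boundary conditions
    Qnew[:, 0] = Qnew[:, 1]
    Qnew[:, nx-1] = Qnew[:, nx-2]

# [...]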


In our simulation example two media are separated by an interface. The left
medium is the same as described in Table 8.2. The right medium has a velocity
twice as fast (for equal density this corresponds to a shear modulus four times
the value on the left side). The time step is adapted accordingly. The results of
the simulation are shown in Fig. 8.14 and compared with the finite-difference
method using a staggered-grid solution for the velocity-stress elastic wave
equation using identical parameters.

Both solutions are indistinguishable except that the finite-difference solution 
(dotted line) is reflected from the domain boundaries as no absorbing condition 
is implemented. The solution shows the expected transmission and reflection be- 
haviour at an elastic material interface. For the parameters used in this simulation 
the shape of the initial condition remains unchanged and no numerical dispersion 
is visible. 
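
The transmission behaviour can be checked with a self-contained, vectorized sketch of the heterogeneous Lax-Wendroff scheme. The interface position (6,000 m), the source position (4,000 m), equal density on both sides, and the simulation time of 1.2 s are assumptions chosen for illustration; the comparison value follows from continuity of stress and velocity at the interface, which gives a stress transmission coefficient of twice the right impedance divided by the sum of both impedances.

```python
import numpy as np

nx, xmax = 800, 10000.0
dx = xmax / nx
x = (np.arange(nx) + 0.5) * dx
rho = np.full(nx, 2500.0)                      # equal density on both sides (assumed)
c = np.where(x < 6000.0, 2500.0, 5000.0)       # assumed interface at 6,000 m
mu = rho * c**2
dt = 0.5 * dx / c.max()                        # Courant number 0.5 in the fast medium

A = np.zeros((2, 2, nx))
A[0, 1], A[1, 0] = -mu, -1.0 / rho             # coefficient matrix for every cell
AA = np.einsum('ijk,jlk->ilk', A, A)           # matrix product A_i A_i per cell

Q = np.zeros((2, nx))
Q[0] = np.exp(-((x - 4000.0) / 200.0) ** 2)    # assumed source at 4,000 m
nt = int(round(1.2 / dt))
for _ in range(nt):
    Qo = Q.copy()
    dQl = Qo[:, 1:-1] - Qo[:, :-2]
    dQr = Qo[:, 2:] - Qo[:, 1:-1]
    Q[:, 1:-1] = (Qo[:, 1:-1]
                  - dt / (2 * dx) * np.einsum('ijk,jk->ik', A[:, :, 1:-1], dQl + dQr)
                  + 0.5 * (dt / dx) ** 2
                  * np.einsum('ijk,jk->ik', AA[:, :, 1:-1], dQr - dQl))
    Q[:, 0], Q[:, -1] = Q[:, 1], Q[:, -2]      # simple absorbing boundaries

Z = rho * c
T_stress = 2 * Z[-1] / (Z[0] + Z[-1])          # stress transmission coefficient
peak_right = Q[0, x > 6500.0].max()            # transmitted stress amplitude
print(peak_right, 0.5 * T_stress)
```

The incident pulse carries half of the initial stress amplitude, so the transmitted peak should be close to 0.5 times the transmission coefficient, up to the discretization error at the interface.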

We can conclude that we have provided a complete description of the elastic
wave-propagation problem in 1D, with the assumption that (1) the solution fields
(stress and velocities) as well as (2) the elastic parameters $\mu$ and $\lambda$ and thus the
impedances $Z = \rho c$ are constant inside the finite-volume cells. Note that these
concepts can be applied to grid cells of any shape with straight boundary edges
and this is the strength of the finite-volume method.


8.5.3 The Riemann problem: heterogeneous case 


Despite the fact that we were able to make use of the theoretical developments for 
the homogeneous case directly for the heterogeneous case in the previous section, 
it is instructive to present the theory of the Riemann problem for a material inter- 
face. We will closely follow the notation given in Leveque (2002). In fact, it allows 
us to develop fundamental results for seismic wave propagation, and the reflection 
and transmission coefficients for perpendicular incidence. The problem is illus- 
trated in Fig. 8.15. At an interface the quantity Q to be advected is discontinuous. 
In addition, the advection velocities on both sides of the interface differ. 
Fig. 8.15 Riemann problem, heterogeneous case. Top: Single discontinuity
separating two regions with different properties. Bottom: Velocities and
impedances on both sides of the discontinuity. The Riemann problem solves the
problem of how waves on both sides are partitioned.

Fig. 8.16 Reflection and transmission coefficients (figure labels:
$T = 2Z_l/(Z_l + Z_r)$, $Z_{l,r} = \rho_{l,r} c_{l,r}$). Seismic waves incident
perpendicular to a material discontinuity are reflected and transmitted according
to coefficients $R$ and $T$. These coefficients can be derived via the Riemann
problem used to develop flux schemes for finite-volume methods.


Mathematically the solution still has to consist of a weighted sum over eigenvectors
that now describe solutions in the left and right parts. Following the
developments above, the decomposition into eigenvectors for this problem is

$$ \Delta Q = \alpha_1 \begin{pmatrix} -Z_l \\ 1 \end{pmatrix} + \alpha_2 \begin{pmatrix} Z_r \\ 1 \end{pmatrix} \tag{8.63} $$

for some unknown scalar values $\alpha_{1,2}$. The first term corresponds to the left
domain, and the second term to the right domain. This can be written as a linear
system of the form

$$ R_{lr} \, \alpha = \Delta Q, \tag{8.64} $$

where $\alpha$ is a vector and the matrix of eigenvectors $R_{lr}$ is

$$ R_{lr} = \begin{pmatrix} -Z_l & Z_r \\ 1 & 1 \end{pmatrix} \tag{8.65} $$

with the inverse

$$ R_{lr}^{-1} = \frac{1}{Z_l + Z_r} \begin{pmatrix} -1 & Z_r \\ 1 & Z_l \end{pmatrix}. \tag{8.66} $$

We want to know how these eigenvectors are partitioned (left- and right-propagating
waves with different velocities) given the field discontinuity $\Delta Q$. This
discontinuity cannot be arbitrary. For example, it could correspond to an incident
wave from the left. In this case the discontinuity to partition would correspond to

$$ \Delta Q = \begin{pmatrix} Z_l \\ 1 \end{pmatrix}, \tag{8.67} $$

which could be arbitrarily scaled. Consequently, we ask what this implies for the
waves propagating in the right domain and, after a possible reflection, in the
left domain. With the discontinuous material parameters in matrix $R_{lr}$ we seek
$\alpha$ such that

$$
\begin{aligned}
\alpha &= R_{lr}^{-1} \Delta Q \\
\begin{pmatrix} \alpha_1 \\ \alpha_2 \end{pmatrix}
 &= \frac{1}{Z_l + Z_r} \begin{pmatrix} -1 & Z_r \\ 1 & Z_l \end{pmatrix} \begin{pmatrix} Z_l \\ 1 \end{pmatrix}
 = \begin{pmatrix} \dfrac{Z_r - Z_l}{Z_l + Z_r} \\[2mm] \dfrac{2 Z_l}{Z_l + Z_r} \end{pmatrix}
 = \begin{pmatrix} R \\ T \end{pmatrix}
\end{aligned}
\tag{8.68}
$$

and obtain (surprise, surprise) the well-known transmission ($T$) and reflection
($R$) coefficients for perpendicular incidence at a material discontinuity (see
Fig. 8.16). It is worth noting that this result is obtained from first principles, which
simply require the conservation of a quantity (here, the vector of stress and
velocity) and knowledge of the solution to an advection problem (assuming locally
constant parameters).
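
The same result can be obtained numerically by solving the 2 x 2 system for the weights. The sketch below uses the eigenvector sign convention of this subsection; the function name `reflection_transmission` and the impedance values are hypothetical.

```python
import numpy as np

def reflection_transmission(Z_l, Z_r):
    """Weights of the reflected and transmitted waves for incidence from the left."""
    R_lr = np.array([[-Z_l, Z_r], [1.0, 1.0]])   # eigenvectors of left and right medium
    dQ = np.array([Z_l, 1.0])                    # incident wave from the left
    refl, trans = np.linalg.solve(R_lr, dQ)      # solves R_lr alpha = dQ
    return refl, trans

# Arbitrary test impedances
refl, trans = reflection_transmission(Z_l=1.0, Z_r=3.0)
print(refl, trans)
```

For equal impedances on both sides the reflected weight vanishes and the transmitted weight equals one, as expected for a homogeneous medium.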

In the next section we present an alternative view of the finite-volume method. 


8.6 Derivation via Gauss’s theorem 


This view is based on discretizing Gauss’s theorem (also known as the divergence
theorem), and follows the results of Dormy and Tarantola (1995). Gauss’s
theorem states that the outward flux of a vector field $Q_i(x,t)$ through a closed
surface $S$ is equal to the volume integral of the divergence over the volume $V$ inside
the surface at some time $t$. Mathematically this can be expressed as

$$ \int_V \partial_i Q_i \, dV = \int_S n_i Q_i \, dS, \tag{8.69} $$

where $n_i$ are the components of the local surface-normal vector. First of all, note
that this formulation is not restricted to vector fields: it can generalize to any
tensor field $Q_{ij\ldots}$, and of course also applies to scalar fields $Q(x,t)$. We will use the
above relation to estimate partial derivatives of scalar fields (which could represent
one component of a vector field).

Assuming the gradient of the solution field is smooth enough and can be assumed
constant inside volume $V$, we can take it out of the integral and obtain
an expression for the derivative as a function of an integral over a surface $S$ with
segments $dS$ in 3D or a line with segments $dL$ surrounding a surface $S$ in 2D:

$$
\begin{aligned}
\partial_i Q \int_V dV &= \int_S n_i Q \, dS \\
\partial_i Q \big|_{3D} &= \frac{1}{V} \int_S n_i Q \, dS \\
\partial_i Q \big|_{2D} &= \frac{1}{S} \int_L n_i Q \, dL.
\end{aligned}
\tag{8.70}
$$

An example in 2D with a surface consisting of linear segments (a polygon) is
shown in Fig. 8.17. Once a discretization of surfaces or surrounding lines is found
we can develop a discrete scheme replacing the integrals in the above equations
by sums to obtain

$$
\begin{aligned}
\partial_i Q_{3D} &\approx \frac{1}{V} \sum_\alpha n_i^\alpha \, dS^\alpha \, Q^\alpha \\
\partial_i Q_{2D} &\approx \frac{1}{S} \sum_\alpha n_i^\alpha \, dL^\alpha \, Q^\alpha.
\end{aligned}
\tag{8.71}
$$


The importance here is that this description is entirely independent of the shape 
of a particular volume. Provided that the space can be filled entirely with poly- 
gons, we have a numerical scheme that can be applied to solve partial differential 
equations. 
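
To make the discrete 2D formula concrete, here is a minimal sketch that estimates the gradient of a scalar field inside an arbitrary polygon. It assumes the corner points are given in counter-clockwise order and that one field value per edge is available (here sampled at the edge midpoints); the function name `gauss_gradient` and the linear test field are hypothetical.

```python
import numpy as np

def gauss_gradient(vertices, Q_edges):
    """Gradient estimate (1/S) * sum over edges of n * dL * Q.

    vertices: (n, 2) polygon corners in counter-clockwise order.
    Q_edges:  n field values, one per edge (edge k runs from corner k to k+1).
    """
    v = np.asarray(vertices, dtype=float)
    v_next = np.roll(v, -1, axis=0)
    edge = v_next - v
    # Outward normal times edge length for counter-clockwise ordering: (dy, -dx)
    n_dL = np.column_stack((edge[:, 1], -edge[:, 0]))
    # Polygon area from the shoelace formula
    S = 0.5 * np.sum(v[:, 0] * v_next[:, 1] - v_next[:, 0] * v[:, 1])
    return (n_dL * np.asarray(Q_edges, dtype=float)[:, None]).sum(axis=0) / S

# Rhombus with half-diagonals d1 and d2 (arbitrary test values)
d1, d2 = 2.0, 1.0
rhombus = np.array([[d1, 0.0], [0.0, d2], [-d1, 0.0], [0.0, -d2]])
mid = 0.5 * (rhombus + np.roll(rhombus, -1, axis=0))     # edge midpoints
field = lambda p: 3.0 * p[:, 0] - 2.0 * p[:, 1] + 5.0    # hypothetical linear field
grad = gauss_gradient(rhombus, field(mid))
print(grad)
```

For a linear field sampled at the edge midpoints the estimate reproduces the exact gradient for any polygon shape, which makes this a convenient correctness check before applying the operator to curved fields.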

Let us take a simple example illustrated in Fig. 8.18. A rhombus-shaped
finite-volume cell is described by four edge points:

$$
\begin{aligned}
P_1 &= (-\Delta_1, 0), & P_2 &= (+\Delta_1, 0), \\
P_3 &= (0, -\Delta_2), & P_4 &= (0, +\Delta_2)
\end{aligned}
\tag{8.72}
$$
Fig. 8.17 Concept of numerical gradient calculation using Gauss’s theorem,
illustrated in 2D. The constant gradient of a scalar field $Q$ inside the
finite volume $S$ is approximated as
$\partial_i Q \approx (1/S) \sum_\alpha n_i^\alpha \, dL^\alpha \, Q^\alpha$. The polygon
can have any shape.

Fig. 8.18 Rhombus-shaped finite-volume cell with corner points $(-\Delta_1, 0)$,
$(\Delta_1, 0)$, $(0, -\Delta_2)$, and $(0, \Delta_2)$, used to illustrate the
finite-volume approach based on Gauss’s theorem.
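with the length of the sides given by $\ell = \sqrt{\Delta_1^2 + \Delta_2^2}$ and the surface $S = 2 \Delta_1 \Delta_2$.
The four normal vectors are defined by

$$ n = \frac{1}{\ell} \begin{pmatrix} \pm\Delta_2 \\ \pm\Delta_1 \end{pmatrix}, \tag{8.73} $$

with one sign combination for each of the four sides. We now have all components
necessary to apply Eq. 8.71 in the 2D case. We integrate along the paths suggested
in the figure (the contributions of $Q_3$ and $Q_4$ cancel) to obtain

$$
\begin{aligned}
\partial_1 Q &\approx \frac{1}{2\Delta_1\Delta_2} \left[
 \frac{\ell}{2} Q_1 \left( -\frac{\Delta_2}{\ell} - \frac{\Delta_2}{\ell} \right)
 + \frac{\ell}{2} Q_2 \left( \frac{\Delta_2}{\ell} + \frac{\Delta_2}{\ell} \right) \right] \\
 &= \frac{1}{2\Delta_1\Delta_2} \left( -Q_1 \Delta_2 + Q_2 \Delta_2 \right) \\
 &= \frac{Q_2 - Q_1}{2\Delta_1}
\end{aligned}
\tag{8.74}
$$

for the first derivative of $Q$ with respect to $x$.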
Maybe not so surprisingly, but again with an entirely different approach,
we recover the well-known centred finite-difference formula for the first derivative.
While this is an elegant approach, unfortunately, the method
is of low-order accuracy and cannot in this form easily be extended to high-order
accuracy. However, the applicability to arbitrarily shaped volumes is attractive.
This allows the solution of problems on entirely unstructured grids that can (in 2D)
be triangulated and described with Voronoi cells.

Examples of operators so defined and applications to elastic wave-propagation
problems can be found in Käser et al. (2001) and Käser and Igel (2001) (see
Fig. 8.19).

The strong desire to solve elastic wave-propagation problems on unstructured
grids, and frustration with the low accuracy of the simple finite-volume
approach and other low-order techniques, eventually led to the search for other
more accurate approaches and eventually to the adaptation of the discontinuous
Galerkin method to seismic wave propagation (see Käser and Dumbser, 2006).

Fig. 8.19 Example of elastic wave simulations in 2D using difference operators
based on the finite-volume approach (Käser and Igel, 2001). In this example
an unstructured grid follows a free-surface topography with ghost cells outside
the surface to implement stress-free boundary conditions. Reprinted with
permission.


8.7 The road to 3D


In many publications in which numerical methods are compared, it is stated that
finite-volume methods are disadvantageous because they are of low-order accuracy.
In the developments illustrated above we restricted ourselves to the situation
where the solution fields inside the cells are constant (or linear functions, one way
of developing a second-order scheme like the Lax-Wendroff method). However,
the restriction to low-order representations of the solution fields no longer applies:
Dumbser et al. (2007a) presented an arbitrary high-order finite-volume scheme
in space and time that allows the solution of viscoelastic wave propagation on
unstructured meshes (see Fig. 8.20). They make use of the so-called ADER scheme
(Arbitrary high-order DERivative) developed by the group of Prof. Toro at the
University of Trento (Titarev and Toro, 2002).

Dumbser et al. (2007a) discuss in detail the differences of their high-order 
finite-volume approach compared with the discontinuous Galerkin method the 
same group developed shortly before (Käser and Dumbser, 2006). They note that
the finite-volume method has advantages concerning the number of degrees of 
freedom and the overall computation time. A disadvantage is the overhead when 
reconstructing the high-order representation inside the volume cells. Even in the 
high-order finite-volume approach it is still the cell average that is updated. As 
will be seen in Chapter 9 on the discontinuous Galerkin method, the fields inside 
elements are described by Lagrange polynomials and all polynomial coefficients 
are updated. Other developments, called spectral finite-volume methods, were 
pursued by Wang (2002). 

In my view the potential of the finite-volume approach for elastic wave- 
propagation and rupture problems has not yet been fully explored, and there is 
room for further studies and applications in Earth sciences. 


Chapter summary 


• The finite-volume method naturally follows from discretizing conservation
  equations considering fluxes between finite-volume cells of averaged
  solution fields.

• The fluxes across boundaries during an extrapolation step are estimated
  using solutions to the Riemann problem.

• The Riemann problem considers the advection of a single-jump discontinuity,
  taking into account the analytical solution of the homogeneous
  problem. It allows an analytical prediction of how much of the material
  (energy, stress, etc.) enters into or leaves a cell.

• The lowest-order finite-volume solution to the advection equation leads to
  a finite-difference algorithm with a forward (or backward) spatial
  differencing scheme, depending on the advection direction. This is called an
  upwind scheme.

• First-order finite-volume schemes are highly diffusive and are not appropriate
  for the solution of wave-propagation problems. The second-order
  Lax-Wendroff scheme does a much better job.

• The problem of elastic wave propagation can thus be formally cast as a
  first-order hyperbolic problem. Thus, and with only slight modifications,
  the fundamental schemes developed for the scalar advection problem can
  be applied to elastic wave propagation.

• In the finite-volume method the problem of estimating partial derivatives
  (finite differences) is replaced by the requirement to accurately calculate
  fluxes across cell boundaries.

• A major advantage of the finite-volume method is the fact that the scheme
  can be easily applied to volume cells of any shape.

• Finite-volume schemes for arbitrary high-order reconstructions inside the
  cells and high-order time-extrapolation schemes have been developed but
  not used extensively in seismology.

Fig. 8.20 Computational mesh for finite-volume solution of wave propagation.
The mesh based on tetrahedra is refined towards the centre of the model,
where simulations are required to be highly accurate. From Dumbser et al.
(2007a).


FURTHER READING 


• Leveque (2002) provides an extensive discussion of the finite-volume
  method for the advection and acoustic/elastic wave-propagation problem.
  Supplementary electronic material for some of the numerical solutions is
  also available. The text provides an in-depth discussion of the Riemann
  problem and its impact for different physical problems.

• The most general formulation of the elastic wave equation for arbitrary
  high-order accuracy can be found in Dumbser et al. (2007a). Given the
  thorough discussion of the benefits of the finite-volume method over other
  schemes in some cases, it is surprising that this approach has so far not been
  used for research applications.


EXERCISES 
Comprehension questions 


(8.1) What is the connection between finite-volume methods and conservation
equations?

(8.2) What is meant by a finite volume; is there any difference between this and
a finite element?

(8.3) If you look at the upwind approach to the scalar advection problem
(Eq. 8.29), why is the finite-volume method so closely linked to
staggered-grid finite-difference schemes? Explain.

(8.4) What are the main advantages of finite-volume methods compared with
finite-difference methods?

(8.5) Explain the Riemann problem and illustrate why it is so essential for
finite-volume schemes.

(8.6) In what areas of natural sciences are finite-volume schemes mostly used?
Explore the literature and try to give reasons.

(8.7) What is numerical diffusion? Why is it relevant for finite-volume methods?

(8.8) What is the connection between reflection/transmission coefficients of
seismic waves and the finite-volume method?

(8.9) The finite-volume method extrapolates cell averages. What strategies do
you see to extend the method to high-order accuracy?


Theoretical problems 


(8.10) Show that Eq. 8.6 is a finite-difference solution to the equation
$\partial_t Q - a \, \partial_x Q = 0$ using a forward difference in space.

(8.11) Derive the upwind scheme Eq. 8.17 starting with the scalar advection
equation.

(8.12) The stability criterion for the finite-volume method is $c\,dt/dx \le 1$.
Starting with Fig. 8.7, derive this stability criterion from first
principles.

(8.13) Starting with the advection equation $\partial_t Q - a \, \partial_x Q = 0$ derive the
second-order wave equation by applying the so-called Cauchy-Kowalewski
procedure (see text).

(8.14) Following the finite-volume approach based on the divergence theorem,
calculate the spatial derivative operator for the hexagonal cell shown in
Fig. 8.21 and functional values defined at three points $P_i$.

(8.15) The linear system for elastic wave propagation in 1D (transverse motion)
is given in Eq. 8.33. The wave equation can also be formulated for
compressional waves using the compressibility $K$ as elastic constant.
Reformulate the linear system for acoustic wave propagation and calculate
the eigenvalues of the resulting matrix $A$.

(8.16) For either an elastic or an acoustic linear system derive the eigenvectors
of matrix $A$, the matrix of eigenvectors, and its inverse.

(8.17) Show that the superposition of left- and right-propagating stress and
velocity waves (Eq. 8.46) are solutions to the linear system of equations
(Eq. 8.33) for elastic wave propagation.

(8.18) Show that a discontinuity of the form $\Delta Q = [1, 0]$ leads to an
equipartitioning of two seismic waves propagating in opposite directions.
Start with the Riemann problem formulated for the homogeneous case
(Eq. 8.48).

(8.19) Derive reflection and transmission coefficients for seismic waves with
vertical incidence by considering the Riemann problem for material
discontinuity (Eq. 8.68).

Fig. 8.21 Hexagonal grid cell with functional values defined at three points.

(8.20) Show that the derivation of the reflection and transmission coefficients
(Eq. 8.68) is also possible assuming a left-propagating wave with
eigenvector $\Delta Q = [-Z_r, 1]$.


Programming exercises 


(8.21) Write a finite-volume algorithm for the scalar wave equation from scratch,
implementing both the Euler upwind and the Lax-Wendroff schemes.
Implement the scheme such that you can easily change between the two
approaches. Compare the solution behaviour and discuss the results. To
start, use the parameters given in Table 8.1. Code the analytical solution
and compare the results with the numerical solution.

(8.22) Determine the stability limit of the Euler and Lax-Wendroff schemes for
the scalar advection equation.

(8.23) Create a highly unstructured 1D mesh and investigate the accuracy of the
finite-volume method (Lax-Wendroff) for the scalar advection problem.

(8.24) Investigate the concept of trapped elastic waves by inserting an initial
condition in a low-velocity region. Use the Lax-Wendroff algorithm
in 1D.

(8.25) Implement circular boundary conditions in the 1D elastic Lax-Wendroff
solution. Initiate a sinusoidal function $f(x) = \sin(kx)$ that is advected in
one direction. Investigate the accuracy of the finite-volume scheme as a
function of wavelength and propagation distance by comparing with the
analytical solution.

(8.26) The finite-volume method is supposed to conserve energy in the
homogeneous case. Use the computer programs for scalar advection, set
up an example, and calculate the total energy in the system for each
time step. Check whether it is conserved. Explore this problem for the
heterogeneous case.

(8.27) Scalar advection problem: Advect a Gaussian-shaped waveform for as
long as you can and extract the travel time difference with the analytical
solution in an automated way using cross-correlation. Plot the time error
as a function of propagation distance and your simulation parameters
(e.g. grid points per wavelength, Courant criterion).


The Discontinuous Galerkin 
Method 


‘Why yet another method?’ you may ask. With the spectral-element method we 
seem to have given answers to the main simulation requirements in seismology. 
Yet the devil is in the detail, and there are still some cases where even the spectral- 
element method based on hexahedral meshes is not the optimal choice. Before we 
motivate the application of the discontinuous Galerkin method (sometimes also 
called the discontinuous Galerkin finite-element method but I find that too long) to 
the problem of seismic wave propagation, let us recall the pros and cons of the 
methods we have encountered so far. 

The finite-difference method, still a major workhorse, was presented as a sim- 
ple, quite flexible low-order method which, however, suffers from the difficulties 
encountered in implementing boundary conditions with high accuracy for com- 
plex shapes. In addition, because of the regular-grid discretization, adaptation to 
models with strong heterogeneities is difficult. While the pseudospectral method 
allowed the improvement of the spatial accuracy, this was only possible at the 
expense of substantially more operations per time step. Boundary conditions 
were even more difficult to implement efficiently (though this was fixed with 
the Chebyshev approach). For 3D wave propagation the method was more or 
less abandoned in its classic form, because of the required global communication 
scheme making parallelization inefficient. 

With the finite-element method the extension to high-order approximations 
inside the elements was possible and hexahedral or tetrahedral grids allowed 
meshing of geometrically complex models. One of the main advantages of finite- 
element-type methods is the implicit accurate modelling of the free-surface 
(stress-free) boundary condition. However, the drawback is the implicit scheme 
requiring the inversion of huge system matrices. For hexahedral meshes and a 
clever combination of interpolation and integration schemes this problem can be 
fixed, leading to an explicit extrapolation scheme—the spectral element method. 
The finite-volume method can be considered another attempt to become more 
flexible in the choice of model geometry. 

It turns out that computational meshes with hexahedral elements, even with 
curved edges, are difficult to generate when boundaries (e.g. surface topography, 
internal interfaces, faults with complex shapes) need to be honoured. On the other 
hand—as is well known in the engineering community—tetrahedral meshes are 
easily generated for arbitrary shapes (see Fig. 9.1) once the surfaces are known 


Computational Seismology. First Edition. Heiner Igel. 
© Heiner Igel 2017. Published in 2017 by Oxford University Press. 


9.1 History
9.2 The discontinuous Galerkin method in a nutshell
9.3 Scalar advection equation
9.4 Elastic waves in 1D
9.5 The road to 3D
Chapter summary
Further reading
Exercises


Fig. 9.1 Tetrahedral grids. Mesh of the 
famous Matterhorn mountain at the 
border between Switzerland and Italy. 
The colours indicate mesh partitioning 
to different processors. Figure courtesy of 
Martin Käser.


1 A very peculiar remote location, near
a giant extinct volcano 2,300 m above 
sea level in New Mexico. LANL was cre- 
ated during the Second World War for the 
development of the first atomic bombs. 
Amongst other things, today it has a strong 
focus on supercomputing in physics. And 
great bike riding! 

2 At the time, Martin Käser and Michael
Dumbser shared an office at Trento Uni- 
versity; their collaboration, together with 
Josep de la Puente in Munich, led to an 
impressive, high-speed development of the 
discontinuous Galerkin method for seis- 
mic wave propagation. 




in parametric form. Therefore, the question arises of which numerical scheme is 
capable of efficiently solving the elastic wave equation on tetrahedral (or generally 
unstructured) grids. 

This is the key motivation that led to the transference of the discontinuous 
Galerkin method to seismology. As the story unfolded, several other beneficial 
aspects of the method became apparent (e.g. the efficient implementation of lo- 
cal time stepping, high accuracy of frictional boundary conditions for dynamic 
rupture problems) that now constitute some of its most attractive features. 

But let’s go back to the beginning. We first review the history of the discon- 
tinuous Galerkin method and then describe the method in a nutshell. This will 
be followed by a discussion of the ingredients of the method and properties in 
comparison with the other methods encountered so far. Finally, we present the 
numerical solution to the elastic wave-propagation problem. 


9.1 History 


The discontinuous Galerkin method was developed in the Los Alamos National 
Laboratories (LANL)¹ for the problem of neutron transport by Reed and Hill
(1973), formulated on triangular meshes. Starting in the late eighties, B. Cock- 
burn and co-workers provided a theoretical framework for the discontinuous 
Galerkin method in connection with high-order Runge—Kutta-type time inte- 
gration schemes, summarized in Cockburn et al. (2000). Until then, numerical 
methods for seismic wave propagation problems had primarily been based on 
regular grid methods. 

The desire to solve wave-propagation problems on unstructured grids 
appeared when the first solvers for global wave propagation were developed 
in spherical coordinates (Igel and Weber, 1995; Igel and Weber, 1996). It was 
obvious that for 3D global wave propagation regular grid methods in spherical 
coordinates would not work. Therefore, finite-difference-type operators on 
unstructured grids were considered that could be applied to arbitrary point 
clouds with point densities following the seismic velocity models, keeping the 
number of grid points per wavelength constant in the whole domain (Käser et al.,
2001; Käser and Igel, 2001).

At the time that the first European training network in computational seismol- 
ogy (SPICE) took off in 2004, with a strong focus on seismic forward modelling, 
an efficient solution for wave propagation on unstructured tetrahedral grids was 
still lacking. During a meeting on numerical methods organized by the finite- 
volume expert E. Toro in Trento, Italy, the idea to apply the discontinuous 
Galerkin method to seismic wave-propagation problems came up. A quick first 
glance at the mathematical structure revealed that a transfer from the arbitrary 
high-order approach developed for the aero-acoustic problem (later published by 
Dumbser and Munz (2005a and 2005b)) should be straightforward.²

The first application to elastic wave propagation was published by Käser
and Dumbser (2006) for the 2D case (see Fig. 9.2), later extended to 3D by
Käser et al. (2007b). Further rheological models were incorporated, such as
viscoelasticity (Käser et al., 2007a), anisotropy (de la Puente et al., 2007), and
poroelasticity (de la Puente et al., 2008), extending substantially the domains of 
application. 

Käser et al. (2008) carried out a detailed analysis of the convergence properties
of the discontinuous Galerkin method. Substantial progress could be made by 
introducing the concept of local time stepping. Being able to arbitrarily change 
the mesh density by using tetrahedra is a major advantage. However, when Earth 
models are simulated with very strong velocity variations the global time step 
depends on the smallest grid cell and the largest velocity. This implies that often 
large parts of the models are oversampled. The local time-stepping approach by 
Dumbser et al. (2007b) circumvents this problem, thereby reducing the overall
computations. 

The geometrical flexibility of the discontinuous Galerkin method is an attrac- 
tive feature for kinematic and dynamic rupture simulation problems. Kinematic 
rupture scenarios with curved faults were first presented by Käser et al. (2007b).
de la Puente et al. (2009a) implemented frictional boundary conditions that al- 
low the simulation of dynamic rupture. These were later extended to 3D by Pelties 
et al. (2012). Gallovic et al. (2010) used these concepts to study strong ground 
motion in the presence of geometrically complex faults and topography. 

For elastic wave-propagation problems without strong geometrical complexity 
or material heterogeneities the discontinuous Galerkin method is most likely not 
the method of choice, as finite-difference or spectral-element methods provide 
more efficient solutions. However, for dynamic rupture problems, the discontinuous
Galerkin method is currently the most accurate solver, in particular for
complicated fault models (see Chapter 10 on applications). 

Further developments of the discontinuous Galerkin method were carried out 
by the Austin group (Wilcox et al., 2010) using nodal³ basis functions on hexahedral
grids with applications to global wave propagation using the cubed-sphere
approach (Ronchi et al., 1996). Hermann et al. (2011) introduced an algorithm 
that allowed combining tetrahedral and hexahedral grids. Etienne et al. (2010) 
introduced a discontinuous Galerkin method for tetrahedral meshes using nodal 
basis functions. This methodology was later extended to the problem of dynamic 
rupture by Tago et al. (2012). 

The discontinuous Galerkin method has tremendous flexibility in terms of 
the variation of element sizes (also called h-adaptivity, where h stands for a representative
element size) across the computational mesh, as well as the option
to vary the polynomial order arbitrarily in each cell (called p-adaptivity). As any 
realistic simulation requires the parallelization of the numerical algorithm, it is ob- 
vious that h-adaptivity, p-adaptivity, and an unstructured mesh, combined with 
local time stepping, create a tremendous challenge for load-balancing a large-scale 
simulation on current standard parallel computer architectures. 

In de la Puente et al. (2009b) a first analysis of scaling and synchronization
of the discontinuous Galerkin method was undertaken. The challenging
Fig. 9.2 Wave propagation on triangular
meshes. Simulation example with
the discontinuous Galerkin method for a
vertical force at an inclined free surface
(Lamb's problem). Note the flexibility to
vary the triangle density throughout the
model. Figure from Käser and Dumbser
(2006).


3 At this point we need to introduce the 
distinction between nodal and modal ap- 
proaches to approximate functions. The 
nodal approach is what we have used so 
far, allowing exact interpolation at some 
set of points. Modal basis functions may 
be orthogonal but do not have this inter- 
polation property. In this chapter we only 
discuss the nodal approach. 
optimization task piqued the interest of the computational science community
in Munich, initiating a re-engineering that eventually led to the SeisSol code becoming
one of the fastest application codes with close to 50% peak performance
(exceeding 1 PFlop in 2014). The code SeisSol became a finalist in the prestigious
Gordon Bell Competition in 2014 (Breuer et al., 2014).

At the time of writing, the discontinuous Galerkin method is primarily 
used for dynamic-rupture and strong ground-motion problems, as well as 
wave-propagation problems with significant geometrical complexity or velocity 
variations. In the next section we will present a snapshot of the method, before 
discussing its details. 


9.2 The discontinuous Galerkin method 
in a nutshell 


In this section we illustrate qualitatively the most important features of the dis- 
continuous Galerkin method. Basically all concepts that feature as components 
of this method have been discussed previously. These include (1) the finite- 
difference extrapolation using, for example, the Euler scheme; (2) the calculation 
of element-based stiffness and mass matrices; (3) flux calculations at the element 
boundaries as encountered in the finite-volume method; (4) exact nodal interpo- 
lation based on Lagrange polynomials as used in the spectral-element method; 
and (5) numerical integration schemes using collocation points. 

We start with the wave equation in first-order form as used in Chapter 8 on
the finite-volume method. With $v$ as velocity, $\sigma = \sigma_{xy} = \sigma_{yx}$ representing the
only non-zero stress component, and implicitly assuming space-time dependencies,
the wave equation as a coupled system of two first-order partial differential
equations reads

$$
\begin{aligned}
\partial_t \sigma &= \mu \, \partial_x v \\
\rho \, \partial_t v &= \partial_x \sigma + f .
\end{aligned}
\tag{9.1}
$$

This coupled system can be expressed in matrix-vector form:

$$
\partial_t Q + A \, \partial_x Q = f , \tag{9.2}
$$

where $Q = (\sigma, v)^T$ is the vector of unknowns and $A$ contains the coefficients of
the equation given by

$$
A = \begin{pmatrix} 0 & -\mu \\ -1/\rho & 0 \end{pmatrix} . \tag{9.3}
$$

It turns out this is a linear hyperbolic system, with the same form as the classic
advection equation. The basic strategy for the discontinuous Galerkin method is the
same as for the finite-element method: multiplying the equation by an arbitrary
test function combined with describing the unknown fields with the same set of
basis functions (the Galerkin principle).
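
The hyperbolic character of the system can be checked numerically: the eigenvalues of $A$ in Eq. 9.3 should be real and equal to plus and minus the shear-wave speed $\sqrt{\mu/\rho}$. The following is a minimal sketch; the material values are illustrative assumptions and are not taken from the text.

```python
import numpy as np

# Illustrative material parameters (assumed values, not from the text)
rho = 2500.0   # density in kg/m^3
mu = 1.0e10    # shear modulus in Pa

# Coefficient matrix of Eq. 9.3 for Q = (sigma, v)^T
A = np.array([[0.0, -mu],
              [-1.0 / rho, 0.0]])

# Real, distinct eigenvalues indicate a hyperbolic system
eigvals = np.sort(np.linalg.eigvals(A).real)
c = np.sqrt(mu / rho)   # shear-wave speed

print("eigenvalues:", eigvals)
print("wave speed :", c)
```

The two eigenvalues correspond to waves travelling in the negative and positive x-direction with the same speed, which is why the scalar advection equation is a useful starting point.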

The main differences come with the freedom to allow the unknown fields to be 
discontinuous at the element boundaries. Obviously, the elements need to com- 
municate information across the boundaries and this is achieved through a flux 
scheme based on solutions of the Riemann problem which we encountered in 
Chapter 8. What makes the discontinuous Galerkin method so powerful is the 
fact that the formulation leads to an entirely local scheme even for high-order 
extensions. 

This principle is illustrated in Fig. 9.3. The wavefield inside each element is 
described by Lagrange polynomials exactly interpolating at appropriate colloca- 
tion points. At each time step, a flux term F has to be evaluated at all element 
boundaries. The extrapolation scheme of the discontinuous Galerkin method can 
be expressed as 

$$
\partial_t Q^k(t) = (M^k)^{-1} \left( K^k Q^k(t) - F^k(Q^k(t)) \right) , \tag{9.4}
$$

where $Q^k(t)$ is the vector of unknowns, $M^k$ and $K^k$ are the elemental mass and
stiffness matrices respectively, and $F^k(\cdot)$ is the vector containing the flux terms
at the left and right boundaries. The upper index $k$ denotes the element (the source
term is omitted).

The fact that the elements are only connected through the boundary fluxes, 
and there is no global system of equations to solve, has important implications: 
(1) We obtain a fully explicit scheme which lends itself to element-based paral- 
lelization; (2) the choice of element size is arbitrary and has no impact on the 
solution algorithm (h-adaptivity); (3) the polynomial order in each element can 
be arbitrarily chosen and again has no impact on the algorithm (p-adaptivity); 
(4) the fact that we have to consider the boundary points twice to calculate the 
fluxes (see Fig. 9.3) implies an increase in the number of degrees of freedom that 
of course gets worse with increasing dimensionality. 

From the above we can appreciate that the discontinuous Galerkin method 
in the nodal form is very close to the spectral-element method, with the only 
difference being the flux terms. There are many choices for how these fluxes can 
be evaluated. We will proceed with a presentation of the discretization scheme, 
which leads to the complete algorithmic implementation of the method for the 
elastic wave equation. 


9.3 Scalar advection equation 


As already discussed in the chapter on the finite-volume method we can treat 
the 1(2,3)D wave equation just like the classic advection equation as a hyperbolic 
partial differential equation. We proceed by seeking a solution of the scalar linear 
Fig. 9.3 Principle of the nodal discontinuous
Galerkin method. Bottom:
Displacement snapshot of 1D wave
propagation through a heterogeneous
medium. Top: Schematic (exaggerated)
representation of the displacement field
in three adjacent cells. The squares denote
the Lagrange collocation points at
which the fields are exactly interpolated.
Note the varying size of the elements
(h-adaptivity) as well as the different
interpolation orders (p-adaptivity). The
most important aspect is the discontinuous
behaviour at the element edges that
is treated with a flux scheme.

Fig. 9.4 Indexing scheme for discontinuous
Galerkin discretization scheme.
The physical domain of each element is
denoted by $D_k$, with left and right boundaries
$x_l^k$ and $x_r^k$, respectively. Note the
varying element sizes (h-adaptivity).


advection equation using the discontinuous Galerkin approach. This will be fol- 
lowed by a generalization to the vector—matrix problem through the eigensystem 
analysis introduced in the previous chapter. 

The solution vector for the coupled elastic wave equation was denoted by
$Q(x, t)$. In this section we use $q(x, t)$ for the unknown scalar solution and $a$ for
the given (possibly space-dependent, here positive) advection (wave) velocity to
obtain the source-free advection equation

$$
\partial_t q(x, t) + a \, \partial_x q(x, t) = 0 , \tag{9.5}
$$


which we proceed to solve using the discontinuous Galerkin approach. The 
discretization scheme in 1D is illustrated in Figure 9.4. The space domain is 
divided into $n$ elements, each element having the physical domain $x \in D_k$ limited
by left and right boundaries $[x_l^k, x_r^k]$ with element size $h^k = x_r^k - x_l^k$. Like in
other finite-element-type schemes, the element size $h^k$ for every $D_k$ is variable
(h-adaptivity). As indicated in the introduction to this chapter there are two 
approaches to formulating the approximation of the unknown field q(x, t) inside 
each element: (1) the nodal, and (2) the modal approach. The nodal approach 
is slightly simpler in its formulation and allows a direct comparison with the 
spectral-element approach. 

The solution field inside our element will be formulated using the same basis 
functions we encountered in the spectral-element method: Lagrange polynomials. 
To understand the spatial discretization scheme for the moment it is sufficient to 
know that for polynomials of order $N$ each element has $N_p = N + 1$ collocation
points. The two end points coincide with the boundaries. We do not assume con- 
tinuity at the boundaries between elements. Therefore, we have two values (in 1D 
one defined in the left element, one defined in the right element). While we see the 
theoretical consequences later, this also implies that the number of degrees of free- 
dom is larger. We illustrate this by presenting the shape of the solution vector and 
the vector of collocation points in the case of the scalar advection equation as it ap- 
pears later in the implementation. The scalar solution matrix in 1D q is given as 


$$
q_j^k = \begin{pmatrix}
q_1^1 & q_1^2 & \cdots & q_1^n \\
q_2^1 & q_2^2 & \cdots & q_2^n \\
\vdots & \vdots & & \vdots \\
q_{N_p}^1 & q_{N_p}^2 & \cdots & q_{N_p}^n
\end{pmatrix} , \tag{9.6}
$$

where $j = 1, \ldots, N_p$ is the index of the point within an element and $k = 1, \ldots, n$ the
element number. The collocation points at which the solution is calculated are
GLL points stored as

$$
x_j^k = \begin{pmatrix}
x_1^1 & x_1^2 & \cdots & x_1^n \\
x_2^1 & x_2^2 & \cdots & x_2^n \\
\vdots & \vdots & & \vdots \\
x_{N_p}^1 & x_{N_p}^2 & \cdots & x_{N_p}^n
\end{pmatrix} , \tag{9.7}
$$

with the same size as $q_j^k$. The last point of the first element and the first point of
the second element coincide, etc. Again it is important to note that the $q(\cdot)$ values
defined at these points do not have the same values (see Fig. 9.3). In fact this is
the discontinuity in our method's name. We now proceed with the derivation of
the weak form following the discontinuous Galerkin approach.
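
The layout of the coordinate matrix in Eq. 9.7 can be illustrated with a short sketch. It assumes that the GLL points are obtained as the end points plus the roots of the derivative of the Legendre polynomial of order $N$; the helper name `gll_points`, the element edges, and the polynomial order are hypothetical choices for illustration only.

```python
import numpy as np
from numpy.polynomial import legendre as leg


def gll_points(N):
    """GLL points on [-1, 1] for polynomial order N (hypothetical helper)."""
    PN = leg.Legendre.basis(N)
    interior = np.sort(PN.deriv().roots().real)   # roots of P_N'
    return np.concatenate(([-1.0], interior, [1.0]))


N = 2                                      # polynomial order (assumed)
edges = np.array([0.0, 1.0, 1.5, 3.0])     # three elements of different size (assumed)
h = np.diff(edges)                         # element sizes h^k
xi = gll_points(N)                         # local coordinates in [-1, 1]

# Column k holds the N_p = N + 1 physical collocation points of element k
x = edges[:-1] + 0.5 * (1.0 + xi[:, None]) * h

print(x)
print("DG degrees of freedom        :", x.size)
print("continuous degrees of freedom:", len(h) * N + 1)
```

The last row of one column and the first row of the next column hold the same coordinate, which is exactly where the two independent boundary values live.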


9.3.1 Weak formulation 


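To obtain the weak formulation of the scalar advection equation we multiply

$$
\partial_t q(x, t) + a \, \partial_x q(x, t) = 0 \tag{9.8}
$$

with a general test function $\phi_j(x)$, integrate over the $k$th element domain $D_k$ to
obtain

$$
\int_{D_k} \partial_t q(x, t) \, \phi_j(x) \, dx + \int_{D_k} a \, \partial_x q(x, t) \, \phi_j(x) \, dx = 0 , \tag{9.9}
$$

and integrate by parts⁴ replacing the right term containing the space derivative by

$$
\int_{D_k} a \, \partial_x q(x, t) \, \phi_j(x) \, dx = \left[ a \, q(x, t) \, \phi_j(x) \right]_{x_l}^{x_r}
- \int_{D_k} a \, q(x, t) \, \partial_x \phi_j(x) \, dx , \tag{9.10}
$$

where $x_r$ and $x_l$ are the right and left boundaries of element $k$, respectively. We
assume constant velocity $a$ inside the element.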

In the derivation of the finite- (spectral-) element formulation of the second- 
order wave equation the term in square brackets contains the gradient of the 
wavefield. Assuming stress-free boundary conditions at the edges of the physical 
domain this term vanishes, leading to the implicit implementation of the free- 
surface boundary condition. In that sense, this is the important point of departure 
from the classic finite-element concept onto which we want to shed a little more 
light. What does the integration by parts look like with more than one dimension? 


The definition is known as

$$
\int_\Omega \partial_{x_i} u \, v \, d\Omega = \int_\Gamma u \, v \, n_i \, d\Gamma
- \int_\Omega u \, \partial_{x_i} v \, d\Omega , \tag{9.11}
$$

where $u$, $v$ are arbitrary space-dependent functions, $\Omega$ denotes the entire volume,
$\Gamma$ its boundary, and $n_i$ is the $i$th component of the unit vector normal to the boundary. Note that by setting
the function $v = 1$ in the above equation we recover Gauss's theorem by equating
the volume integral of the divergence of a vector field with the surface integral
4 Integration by parts is defined as: $\int u' v = [u v] - \int u v'$.

Fig. 9.5 Integration by parts and
Gauss's theorem. In higher dimensions
(> 1D) the partial integration rule leads
to Gauss's theorem (see text), illustrating
that fluxes at the boundaries need to
be considered for the discontinuous
Galerkin ansatz.

5 Remember the Lagrange polynomials
are defined as

$$
\ell_i(x) = \prod_{\substack{1 \le m \le N_p \\ m \ne i}} \frac{x - x_m}{x_i - x_m} . \tag{9.13}
$$

See Fig. 7.8 for an illustration. For
polynomial order $N$ the index $m$ runs to
$N_p = N + 1$.

6 Named after Boris Galerkin (1871-1945),
a Russian mathematician and engineer,
who, however, referred to Swiss
mathematician Walther Ritz (1878-1909)
as having discovered the method.
over the field itself at the closed boundary. The physical significance of this is
important for our numerical method. The surface integral gives the net outflow 
of the field across the boundary. This justifies calling it the flux term, which will 
occupy us later for some time (see also Fig. 9.5). 

Putting this result back into the advection equation we obtain in general for 
element k 


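$$
\int_{D_k} \partial_t q(x, t) \, \phi_j(x) \, dx - \int_{D_k} a \, q(x, t) \, \partial_x \phi_j(x) \, dx
= - \left[ a \, q(x, t) \, \phi_j(x) \right]_{x_l}^{x_r} , \tag{9.12}
$$

where the right-hand-side term is evaluated at the left and right element boundaries
$x_l$ and $x_r$.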

So far, our unknown field $q(x, t)$ is a continuous function. In the next step
(identical to the procedure for the spectral-element method) we replace $q(x, t)$ by a
finite polynomial representation in terms of a weighted sum over Lagrange polynomials⁵
inside each element $k$ of order $N$ denoted as $\ell_i(x)$, $i = 1, \ldots, N + 1$,
defined in the interval $x \in [-1, 1]$. The exact interpolation property of the Lagrange
polynomials at the GLL collocation points $x_i$ (as presented in detail in
Chapter 7 on spectral elements) means that the approximation $\hat{y}(x)$ of function
$y(x)$ implies $\hat{y}(x_i) = y(x_i)$, where

$$
\hat{y}(x) = \sum_{j=1}^{N_p} y(x_j) \, \ell_j(x) = \sum_{j=1}^{N_p} y_j \, \ell_j(x) . \tag{9.14}
$$

For element $k$ we obtain

$$
q^k(x, t) = \sum_{i=1}^{N_p} q^k(x_i, t) \, \ell_i(x) , \tag{9.15}
$$

where in each element the polynomial order may vary, and $x_i$ are the collocation
points. To make notation easier we keep the same symbol $q(x, t)$ from now on
for the discrete polynomial representation of the original continuous $q_{\mathrm{exact}}(x, t)$.
$N_p = N + 1$ denotes here in general the upper index required for polynomial
order $N$. Again, there is a fundamental difference to the other approaches. This
so-called p-adaptivity (i.e. varying the polynomial order in different elements) is
in general also possible in higher dimensions and is one of the most attractive
features of the discontinuous Galerkin approach.
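
The interpolation property can be made concrete with a few lines of code. The sketch below evaluates the Lagrange polynomials on the three GLL points for $N = 2$ and confirms that each polynomial is one at its own node and zero at the others, so that a quadratic function is reproduced exactly. The function name `lagrange` and the test function are hypothetical and only serve as illustration.

```python
import numpy as np


def lagrange(i, x, nodes):
    """Evaluate the i-th (0-based) Lagrange polynomial on `nodes` at x."""
    x = np.asarray(x, dtype=float)
    val = np.ones_like(x)
    for m, xm in enumerate(nodes):
        if m != i:
            val *= (x - xm) / (nodes[i] - xm)
    return val


nodes = np.array([-1.0, 0.0, 1.0])     # GLL points for N = 2

# Row i holds l_i evaluated at all nodes: should be the identity matrix
cardinal = np.array([lagrange(i, nodes, nodes) for i in range(len(nodes))])

# Interpolate y(x) = x^2 + 3x - 1 from its nodal values
y_nodes = nodes**2 + 3.0 * nodes - 1.0
xs = np.linspace(-1.0, 1.0, 7)
y_hat = sum(y_nodes[j] * lagrange(j, xs, nodes) for j in range(len(nodes)))

print(cardinal)
print(np.max(np.abs(y_hat - (xs**2 + 3.0 * xs - 1.0))))
```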

In addition to replacing the unknown field by Lagrange polynomials, we also 
use them as test functions. This is the well-known Galerkin approach,⁶ giving the
method its name in combination with the discontinuous behaviour at the element 
boundaries. 

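Combining Eq. 9.15 and the left-hand side of Eq. 9.12 we obtain

$$
\int_{D_k} \partial_t \sum_{i=1}^{N_p} q(x_i, t) \, \ell_i(x) \, \ell_j(x) \, dx
- \int_{D_k} a \sum_{i=1}^{N_p} q(x_i, t) \, \ell_i(x) \, \partial_x \ell_j(x) \, dx , \tag{9.16}
$$

and after re-ordering

$$
\sum_{i=1}^{N_p} \left[ \int_{D_k} \ell_i(x) \, \ell_j(x) \, dx \right] \partial_t q(x_i, t)
- \sum_{i=1}^{N_p} \left[ \int_{D_k} a \, \ell_i(x) \, \partial_x \ell_j(x) \, dx \right] q(x_i, t) . \tag{9.17}
$$

We recognize the familiar ingredients of this equation: the elemental mass $M_{ij}$
and stiffness $K_{ij}$ matrices with similar form as in the spectral-element method.
Assuming implicit matrix-vector operations we obtain

$$
M \, \partial_t q(t) - K^T q(t) , \tag{9.18}
$$

where the matrices are given by

$$
\begin{aligned}
M_{ij} &= \int_{D_k} \ell_i(x) \, \ell_j(x) \, dx \\
K_{ij} &= a \int_{D_k} \ell_i(x) \, \partial_x \ell_j(x) \, dx ,
\end{aligned}
\tag{9.19}
$$

assuming constant velocity $a$ inside element $k$. Note that the lower index $D_k$ indicates
that we are still in physical space and we have to map to the local coordinate
system. How are the elements of mass $M_{ij}$ and stiffness $K_{ij}$ matrices calculated in
detail?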


9.3.2 Elemental mass and stiffness matrices 


The local matrices (nodal or modal approach) for arbitrary test functions $\phi_i(x)$
are defined by

$$
\begin{aligned}
M_{ij} &= \int_{D_k} \phi_i(x) \, \phi_j(x) \, dx \\
K_{ij} &= a \int_{D_k} \phi_i(x) \, \partial_x \phi_j(x) \, dx
\end{aligned}
\tag{9.20}
$$

containing integrals over (derivatives of) the test functions $\phi(x)$. These integrals
can in general not be calculated analytically and we have to employ a numerical
integration scheme. We proceed with the same approach as in the spectral-element
method and replace the integrals by a weighted sum over the function values $f(x_i)$
at carefully chosen points $x_i$ inside the elements:

$$
\int_{-1}^{1} f(x) \, dx \approx \sum_{i=1}^{N_p} w_i \, f(x_i) . \tag{9.21}
$$

In the nodal case with Lagrange polynomials, obviously the best choice is to use
the GLL collocation points, leading to a diagonal mass matrix as we will see below.
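
To get a feeling for the quadrature rule in Eq. 9.21, the following sketch computes GLL points and weights for order $N = 4$ and integrates two monomials. It assumes the standard GLL weight formula $w_i = 2 / (N (N + 1) P_N(x_i)^2)$ with $P_N$ the Legendre polynomial; the function name `gll` is hypothetical. With $N + 1$ points the rule is exact for polynomials up to degree $2N - 1$, so $x^6$ is integrated exactly while $x^8$ is not.

```python
import numpy as np
from numpy.polynomial import legendre as leg


def gll(N):
    """GLL points and weights on [-1, 1] for order N (hypothetical helper)."""
    PN = leg.Legendre.basis(N)
    xi = np.concatenate(([-1.0], np.sort(PN.deriv().roots().real), [1.0]))
    w = 2.0 / (N * (N + 1) * PN(xi) ** 2)
    return xi, w


xi, w = gll(4)

int_x6 = np.sum(w * xi**6)   # degree 6 <= 2N - 1 = 7: exact value is 2/7
int_x8 = np.sum(w * xi**8)   # degree 8 >  2N - 1: only approximate (exact: 2/9)

print("sum of weights:", np.sum(w))
print("x^6:", int_x6, "exact:", 2.0 / 7.0)
print("x^8:", int_x8, "exact:", 2.0 / 9.0)
```

The loss of exactness at degree $2N$ is the price paid for the diagonal mass matrix, since the mass-matrix integrand is a polynomial of degree $2N$.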

As in any finite-element type method, we need to map our physical coordinates
into an element-based system. In 1D this is quite simple using $\xi$ as local variable
and transforming via

$$
x^k(\xi) = x_l^k + \frac{(1 + \xi)}{2} \, h^k , \qquad \xi \in [-1, 1] , \tag{9.22}
$$

where $x_l^k$ and $x_r^k$ are the left and right physical boundaries of element $k$, respectively,
and $h^k$ is the element size. In general the mapping of the differential used
to evaluate integrals is called the Jacobian, which is defined for element $k$ as

$$
J^k = \frac{dx^k}{d\xi} = \frac{h^k}{2} , \tag{9.23}
$$

where $h^k$ is the size of element $k$. For the elemental matrices we obtain for arbitrary
test functions

$$
\begin{aligned}
M_{ij}^k &= \int_{-1}^{1} \phi_i(\xi) \, \phi_j(\xi) \, J^k \, d\xi \\
K_{ij}^k &= a \int_{-1}^{1} \phi_i(\xi) \, (J^k)^{-1} \, \partial_\xi \phi_j(\xi) \, J^k \, d\xi \\
&= a \int_{-1}^{1} \phi_i(\xi) \, \partial_\xi \phi_j(\xi) \, d\xi
\end{aligned}
\tag{9.24}
$$

and we note that for the calculation of the stiffness matrix the Jacobians cancel
out. Finally, we can replace the test function with the Lagrange polynomials of
order $N$ leading to the definition of the mass and stiffness matrices:


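$$
\begin{aligned}
M_{ij}^k &= \int_{-1}^{1} \ell_i(\xi) \, \ell_j(\xi) \, J^k \, d\xi
\approx \sum_{m=1}^{N_p} w_m \, \ell_i(\xi_m) \, \ell_j(\xi_m) \, J^k \\
&= \sum_{m=1}^{N_p} w_m \, \delta_{im} \, \delta_{jm} \, J^k
= \begin{cases} w_i \, J^k & \text{if } i = j \\ 0 & \text{if } i \ne j \end{cases} \\
K_{ij}^k &= \int_{-1}^{1} a \, \ell_i(\xi) \, \partial_\xi \ell_j(\xi) \, d\xi
\approx \sum_{m=1}^{N_p} a \, w_m \, \ell_i(\xi_m) \, \partial_\xi \ell_j(\xi_m) \\
&= \sum_{m=1}^{N_p} a \, w_m \, \delta_{im} \, \partial_\xi \ell_j(\xi_m)
= a \, w_i \, \partial_\xi \ell_j(\xi_i) .
\end{aligned}
\tag{9.25}
$$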


Note, as previously obtained for the spectral-element method, the beautifully sim- 
ple structure of these matrices, in particular the diagonal mass matrix. Again, this 
was obtained by combining the Lagrange interpolation with the GLL integration 
scheme. The fundamental difference from the classic finite- (or spectral-) element 
method is that we only determine these matrices at an elemental level. We do not 
need to assemble them to a global system matrix. 


Even though we aim at having each numerical method presented independently,
at this point we refer to the earlier Section 7.4.3 for the calculation of
the derivatives of the Lagrange polynomials at the collocation points $\partial_\xi \ell_j(\xi_i)$.
The same routine developed for the spectral-element method can be used. The
integration weights required are given in Table 7.1.
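
The elemental matrices of Eq. 9.25 can be assembled in a few lines. The sketch below is one possible way to do this and is not the routine of Section 7.4.3: it assumes the derivatives of the Lagrange polynomials are computed with the barycentric formula, and the names `gll` and `lagrange_derivative_matrix` as well as the values of $a$, $h$, and $N$ are hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre as leg


def gll(N):
    """GLL points and weights on [-1, 1] for order N (hypothetical helper)."""
    PN = leg.Legendre.basis(N)
    xi = np.concatenate(([-1.0], np.sort(PN.deriv().roots().real), [1.0]))
    return xi, 2.0 / (N * (N + 1) * PN(xi) ** 2)


def lagrange_derivative_matrix(xi):
    """D[i, j] = derivative of l_j evaluated at xi[i] (barycentric formula)."""
    diff = xi[:, None] - xi[None, :]
    np.fill_diagonal(diff, 1.0)
    c = 1.0 / np.prod(diff, axis=1)           # barycentric weights
    D = (c[None, :] / c[:, None]) / diff      # off-diagonal entries
    np.fill_diagonal(D, 0.0)
    np.fill_diagonal(D, -D.sum(axis=1))       # rows must sum to zero
    return D


N = 3        # polynomial order (assumed)
a = 2.0      # advection velocity (assumed)
h = 0.5      # element size (assumed)

xi, w = gll(N)
J = h / 2.0                                   # Jacobian, Eq. 9.23
D = lagrange_derivative_matrix(xi)

M = np.diag(w * J)                            # M_ij = w_i J delta_ij
K = a * w[:, None] * D                        # K_ij = a w_i dl_j/dxi(xi_i)

# Discrete integration by parts: (K + K^T)/a keeps only the boundary terms
S = (K + K.T) / a
print(np.round(S, 12))
```

The matrix `S` contains only the entries -1 and +1 in its first and last diagonal positions. This is the discrete counterpart of the boundary term in Eq. 9.10 and shows where the flux terms enter.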

We are not ready yet. At this point we have formulations for our unknown field 
inside the elements but no information is transferred from one element to another. 
Essentially, the values at the element boundaries are defined twice and it is not 
clear how to calculate and connect them (see Fig. 9.6 for a sketch). Therefore, we 
introduce the concept of fluxes, to solve the remaining term in Eq. 9.12. 


9.3.3 The flux scheme 


As the flux concept is the key difference between the classic Galerkin methods 
(finite or spectral elements) and the discontinuous Galerkin concept, it deserves 
some special attention. Let us start with the right-hand side of Eq. 9.12 stated 
in the original integral form that also holds for higher dimensions (see discussion 
above and Eq. 9.11): 


$$
\int_{\partial D_k} a \, q(x, t) \, \phi_j(x) \, n \, dx , \tag{9.26}
$$

where $n = \pm 1$ denotes the vector normal to the boundary, in 1D taking the values
$n = -1$ and $n = 1$ at the left and right boundaries, respectively. We proceed
systematically as above and replace the space-dependent part of $q(x, t)$ by a sum
over Lagrange polynomials (Eq. 9.15) to obtain

$$
\sum_{i=1}^{N_p} \int_{\partial D_k} a \, q(x_i, t) \, \ell_i(x) \, \ell_j(x) \, n \, dx , \tag{9.27}
$$

also replacing the arbitrary test function with Lagrange polynomials. The orthogonality
of the Lagrange polynomials and the fact that we are integrating over the
surface $\partial D_k$ (which in 1D consists of the left and right boundaries) leads to

$$
\begin{aligned}
&\sum_{i=1}^{N_p} \left( \ell_i(x_r^k) \, \ell_j(x_r^k) \, (a \, q(x_r^k))^* - \ell_i(x_l^k) \, \ell_j(x_l^k) \, (a \, q(x_l^k))^* \right) \\
&\quad = \ell_j(x_r^k) \, (a \, q(x_r^k))^* - \ell_j(x_l^k) \, (a \, q(x_l^k))^* ,
\end{aligned}
\tag{9.28}
$$

where we introduced the starred terms $(a \, q(x_{l,r}^k))^*$ that are the presently undefined
values at the element boundaries (see Fig. 9.7). Note that in the general integral
formulation above the surface normal $n$ leads to the signs of the two terms in
the bottom line of Eq. 9.28. The expression above corresponds to a flux vector
in $j = 1, \ldots, N_p$ with only the boundary values $x_l^k$, $x_r^k$ being non-zero, and we
denote it as
Fig. 9.6 The discontinuity at the element
boundaries. The key task in the
discontinuous Galerkin scheme concerns
the question of what values to allocate
to the points at the element boundaries.
This involves the use of flux
schemes originating from finite-volume
techniques.

Fig. 9.7 Illustration of the interpolation
scheme for element boundaries. In
Eq. 9.28 the terms $\ell_j(x_{l,r}^k)$ single out the
boundary points. The value of the solution
field $q(x, t)$ at a certain time level is
determined by the flux scheme. Lagrange
polynomials $\ell_{1,N_p}$ are plotted with GLL
collocation points for $N = 4$.

Fig. 9.8 Illustration of fluxes in the
scalar case that will modify the values of
element $k$. The unevenly spaced Gauss-Lobatto-Legendre
points are indicated
by dots. Refer to Eq. 9.32.


F=| : |. (9.29) 
Fy, 


The key question is how to determine the specific flux values $F_1$ and
$F_{N_p}$ for each element. As discussed in Chapter 8 on finite volumes, this is
called the Riemann problem and deals with the question of how to transport a
discontinuity (knowing that we are solving an advection problem).

A natural choice seems to be to take the average of the values on both sides of
the boundaries, which is called the central flux $F^c$ and can be expressed as

$$\begin{aligned}
F_1^c &= \tfrac{1}{2}\, a \left( q(x_r^{k-1}, t) + q(x_l^{k}, t) \right) \\
F_{N_p}^c &= \tfrac{1}{2}\, a \left( q(x_r^{k}, t) + q(x_l^{k+1}, t) \right)
\end{aligned} \tag{9.30}$$

for left and right element boundaries, and a is the velocity in element k. The
more stable choice is the so-called upwind flux $F^{up}$, which takes the
information from the side the wave is coming from. In the scalar advection
problem there is only one speed a. In this case we obtain

$$\begin{aligned}
F_1^{up} &= \begin{cases} a\, q(x_l^{k}) & \text{if } a < 0 \quad (1) \\ a\, q(x_r^{k-1}) & \text{if } a > 0 \quad (2) \end{cases} \\
F_{N_p}^{up} &= \begin{cases} a\, q(x_l^{k+1}) & \text{if } a < 0 \quad (3) \\ a\, q(x_r^{k}) & \text{if } a > 0 \quad (4), \end{cases}
\end{aligned} \tag{9.31}$$

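where we use the boundary value of the neighbouring (or current) element
depending on the sign of the advection velocity (see Fig. 9.8). Both central
and upwind fluxes can be formulated in a compact way, convenient for coding.
This formulation reads (implicit time t)

$$\begin{aligned}
F_{N_p} &= \frac{a}{2} \left( q(x_r^{k}) + q(x_l^{k+1}) \right) + \frac{|a|}{2} (1 - \alpha) \left( q(x_r^{k}) - q(x_l^{k+1}) \right), \\
F_1 &= -\frac{a}{2} \left( q(x_l^{k}) + q(x_r^{k-1}) \right) - \frac{|a|}{2} (1 - \alpha) \left( q(x_r^{k-1}) - q(x_l^{k}) \right),
\end{aligned} \tag{9.32}$$

where α = 0 corresponds to the upwind flux and α = 1 to the centred flux scheme.
Note that Eq. 9.32 carries the sign of the outward normal explicitly (n = -1
for $F_1$), whereas Eqs 9.30 and 9.31 list the interface values without it.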
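
The compact flux of Eq. 9.32 can be checked with a few lines of Python. The following is a minimal sketch, not the book's supplementary code: the function name `interface_fluxes` and the argument names are hypothetical, and the four boundary values are arbitrary test numbers. It illustrates that α = 0 selects the upwind value on each side and α = 1 the average.

```python
def interface_fluxes(a, alpha, q_l_k, q_r_k, q_r_km1, q_l_kp1):
    """Return (F_1, F_Np) of element k following Eq. 9.32.

    q_l_k, q_r_k : values at the left/right boundary of element k
    q_r_km1      : right boundary value of the left neighbour k-1
    q_l_kp1      : left boundary value of the right neighbour k+1
    """
    f_right = 0.5 * a * (q_r_k + q_l_kp1) \
        + 0.5 * abs(a) * (1.0 - alpha) * (q_r_k - q_l_kp1)
    f_left = -0.5 * a * (q_l_k + q_r_km1) \
        - 0.5 * abs(a) * (1.0 - alpha) * (q_r_km1 - q_l_k)
    return f_left, f_right


# Arbitrary test values on both sides of the two interfaces of element k
q_r_km1, q_l_k = 1.0, 2.0   # left interface: neighbour k-1 | element k
q_r_k, q_l_kp1 = 3.0, 5.0   # right interface: element k | neighbour k+1

up_pos = interface_fluxes(2.0, 0.0, q_l_k, q_r_k, q_r_km1, q_l_kp1)
up_neg = interface_fluxes(-2.0, 0.0, q_l_k, q_r_k, q_r_km1, q_l_kp1)
central = interface_fluxes(2.0, 1.0, q_l_k, q_r_k, q_r_km1, q_l_kp1)
print(up_pos, up_neg, central)
```

For a > 0 the left flux uses the neighbour's value and the right flux the element's own value; for a < 0 the roles are swapped, in line with Eq. 9.31 once the sign of the normal is included.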

At this point we have all the ingredients (except the straightforward time 
extrapolation) to allow us to write our first discontinuous Galerkin solver and 
investigate its properties. 


9.3.4 Scalar advection in action 


In this section we will turn the algorithm developed so far into a computer
program, and investigate its performance. In matrix notation we obtained for one
element

$$M\, \partial_t q(t) - K^T q(t) = -F(a, q(t)), \tag{9.33}$$

requiring an extrapolation scheme of the form

$$\partial_t q(t) = M^{-1} \left( K^T q(t) - F(a, q(t)) \right), \tag{9.34}$$

where F(a, q(t)) is the flux vector as defined above. Here the velocity a is
absorbed in the stiffness matrix K, as in the code below. We seek to extrapolate
the system from some initial condition and obtain for each element, using the
simple Euler method,

$$q(t_{n+1}) \approx q(t_n) + dt \left[ M^{-1} \left( K^T q(t_n) - F(a, q(t_n)) \right) \right], \tag{9.35}$$

where for the flux scheme F(·) we use the upwind approach. Note that this is
a local (i.e. usually tiny) elemental system of equations. Communication to the
outside world happens only through the flux vector F(·) according to Eq. 9.32.
It will turn out that the low-order Euler scheme (which actually works fairly well
for the velocity-stress formulation using the finite-difference scheme) is pretty
useless for the discontinuous Galerkin formulation as it becomes unstable even at
low Courant values. Therefore, we employ a high-order extrapolation procedure
known as the predictor-corrector method (or Heun's method, or two-stage
Runge-Kutta method).⁷ Now we are in a position to put everything together and
write our first discontinuous Galerkin solver (at least) for the 1D advection
problem that we will, with very little modification, later extend to the problem
of elastic wave propagation.

The most important initialization step is the calculation of the elemental
matrices, mass M and stiffness K. The following Python code part illustrates a
possible implementation looping through all elements ne and calculating these
matrices as a function of the Jacobian J(k) = h^k/2, where h^k is the size of
element k:


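from numpy import zeros

# Initialize vectors, matrices
Minv = zeros([N+1, N+1, ne])   # inverse mass matrix per element
K = zeros([N+1, N+1, ne])      # stiffness matrix per element
q = zeros([N+1, ne])           # solution vector per element
# [...]
for k in range(ne):
    for i in range(N+1):
        # M is diagonal with entries w[i] * J[k]
        Minv[i, i, k] = 1. / (w[i] * J[k])
# [...]
for k in range(ne):
    for i in range(N+1):
        for j in range(N+1):
            # l1d[j, i]: derivative of Lagrange polynomial j at point i
            K[i, j, k] = a[k] * w[i] * l1d[j, i]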

7 The general formulation for the predictor-corrector method given the problem
$\partial_t y = f(t, y)$ is the following. At time step $t_i$ using time
increment dt

$$\begin{aligned}
k_1 &= f(t_i, y_i) \\
k_2 &= f(t_i + dt,\; y_i + dt\, k_1) \\
y_{i+1} &= y_i + \tfrac{1}{2}\, dt\, (k_1 + k_2),
\end{aligned}$$

where in our case f(·) corresponds to the right-hand side of Eq. 9.34. Have a
look at the code example to see how this works in practice.

Fig. 9.9 The matrix-vector form of the discontinuous Galerkin method. The
system of equations at an elemental level is illustrated by plotting the
absolute matrix/vector values. The corresponding equation is given at the
bottom. Only the calculation of the flux vector F necessitates communication
with adjacent elements.

Note the dimensions of the solution vector q and the matrices, depending on
the highest degree N of the polynomial approximation leading to a scheme
accurate to order N + 1 and having N + 1 degrees of freedom, that is, points per
element. The diagonal mass matrix is kept as a matrix for illustrative purposes,
but of course can be stored as a vector. The subroutine l1d() is the same as in
the spectral-element method and provides the derivatives of the Lagrange
polynomials at the collocation points. Here, we assume that the polynomial order
is constant; however, this can be easily modified (see exercises). An
illustration of the shape and structure of these matrices is given in Fig. 9.9
for an example with N = 3, $N_p$ = N + 1 = 4 and values taken from a simulation.
The mass matrix M is diagonal, the stiffness matrix K is full, and the flux
vector F is non-zero only in its first and last entries.
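
The elemental matrices can be reproduced in a self-contained way. The sketch below is an illustration under stated assumptions rather than the book's implementation: the helper names `gll` and `derivative_matrix` are hypothetical, the GLL points are obtained as roots of the derivative of the Legendre polynomial $P_N$ using NumPy's Legendre module, and the velocity a is left out of K (it can be multiplied in afterwards). The element size h = 50 m is an arbitrary choice.

```python
import numpy as np
from numpy.polynomial import legendre as leg


def gll(N):
    """GLL points xi, weights w and values P_N(xi) on [-1, 1] (N >= 1)."""
    cN = np.zeros(N + 1)
    cN[N] = 1.0                                   # Legendre series of P_N
    interior = leg.legroots(leg.legder(cN))       # roots of P_N'
    xi = np.concatenate(([-1.0], np.sort(interior.real), [1.0]))
    PN = leg.legval(xi, cN)
    w = 2.0 / (N * (N + 1) * PN**2)
    return xi, w, PN


def derivative_matrix(xi, PN):
    """D[i, j] = derivative of Lagrange polynomial j at GLL point i."""
    N = len(xi) - 1
    D = np.zeros((N + 1, N + 1))
    for i in range(N + 1):
        for j in range(N + 1):
            if i != j:
                D[i, j] = PN[i] / (PN[j] * (xi[i] - xi[j]))
    D[0, 0] = -N * (N + 1) / 4.0
    D[N, N] = N * (N + 1) / 4.0
    return D


N, ne, h = 3, 4, 50.0                # order, number of elements, element size
xi, w, PN = gll(N)
D = derivative_matrix(xi, PN)
J = np.full(ne, h / 2.0)             # Jacobian of each element

Minv = np.zeros((N + 1, N + 1, ne))
K = np.zeros((N + 1, N + 1, ne))
for k in range(ne):
    Minv[:, :, k] = np.diag(1.0 / (w * J[k]))   # inverse of diagonal mass matrix
    K[:, :, k] = w[:, None] * D                 # K_ij = w_i * l_j'(xi_i)
```

A useful sanity check is that K transposed applied to a constant vector returns -1 in the first entry and +1 in the last, which is exactly what the flux vector must balance for a constant solution.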

Inside the time loop the flux vector is responsible for the spatial interaction
(i.e. transport). The flux vector has to be calculated anew for each time step (or
intermediate step when using high-order extrapolation schemes) as it depends on
the current values of the wavefield q. An implementation following the definition
of Eq. 9.32 is


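# Flux calculation
F = zeros([N+1, ne])
# [...]
# Interior elements; the first and last element need boundary values
for k in range(1, ne-1):
    # Left boundary of element k (index 0), neighbour value q[N, k-1]
    F[0, k] = -0.5*a*(q[0, k] + q[N, k-1]) \
              - 0.5*abs(a)*(1-alpha)*(q[N, k-1] - q[0, k])
    # Right boundary of element k (index N), neighbour value q[0, k+1]
    F[N, k] = 0.5*a*(q[N, k] + q[0, k+1]) \
              + 0.5*abs(a)*(1-alpha)*(q[N, k] - q[0, k+1])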


where the variable alpha can be used to change the flux scheme from central
(alpha = 1) to upwind (alpha = 0). Note that in the nodal form of the
discontinuous Galerkin method the flux vector has non-zero values only in its
first and last entries.

The element-wise system of equations can be extrapolated by the Euler 
scheme as 


# Extrapolation for every element
for it in range(nt):
    # [...] (flux vector F for the current q)
    for k in range(ne):
        q[:, k] = q[:, k] + dt*(Minv[:, :, k] @ (K[:, :, k].T @ q[:, k]
                                                 - F[:, k]))
    # [...]


where nt is the overall number of time steps, dt is the global time increment,
and ne is the number of elements. Note that we can directly update the solution
vector q without intermediate storage at different time level(s). A high-order
extrapolation scheme like the predictor-corrector method can be implemented as


for it in range(nt):
    # [...]
    # Predictor corrector scheme
    # Initialize flux vectors F for all k
    # [...]
    # First step (predictor)
    for k in range(ne):
        k1[:, k] = Minv[:, :, k] @ (K[:, :, k].T @ q[:, k] - F[:, k])
    # [...]
    # Initialize flux vectors F for q + dt*k1
    # [...]
    # Second step (corrector)
    for k in range(ne):
        k2[:, k] = Minv[:, :, k] @ (K[:, :, k].T @ (q[:, k] + dt*k1[:, k])
                                    - F[:, k])
    # [...]
    # Update
    for k in range(ne):
        q[:, k] = q[:, k] + 0.5*dt*(k1[:, k] + k2[:, k])
    # [...]


We can see that the price for a high-order extrapolation is basically another 
solution of the forward problem. 
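
Putting the pieces together, a complete scalar solver fits into a short script. The following is a minimal sketch under stated assumptions, not the supplementary code of the book: it is vectorized over elements instead of looping, it uses periodic boundaries (via `np.roll`) to avoid special boundary treatment, and the helper names `gll_operators`, `rhs` and `exact` are hypothetical. Parameters follow Table 9.1 except for the shorter run of 2,000 time steps.

```python
import numpy as np
from numpy.polynomial import legendre as leg


def gll_operators(N):
    """GLL points, weights and derivative matrix D[i, j] = l_j'(xi_i)."""
    cN = np.zeros(N + 1)
    cN[N] = 1.0
    interior = np.sort(leg.legroots(leg.legder(cN)).real)
    xi = np.concatenate(([-1.0], interior, [1.0]))
    PN = leg.legval(xi, cN)
    w = 2.0 / (N * (N + 1) * PN**2)
    diff = xi[:, None] - xi[None, :]
    np.fill_diagonal(diff, 1.0)                  # avoid division by zero
    D = (PN[:, None] / PN[None, :]) / diff
    np.fill_diagonal(D, 0.0)
    D[0, 0], D[N, N] = -N * (N + 1) / 4.0, N * (N + 1) / 4.0
    return xi, w, D


def rhs(q, a, alpha, w, D, J):
    """Right-hand side of Eq. 9.34 for all elements; q has shape (N+1, ne)."""
    N = q.shape[0] - 1
    q_r_km1 = np.roll(q[N, :], 1)     # right edge of left neighbour (periodic)
    q_l_kp1 = np.roll(q[0, :], -1)    # left edge of right neighbour (periodic)
    F = np.zeros_like(q)              # flux vector of Eq. 9.32
    F[0, :] = -0.5 * a * (q[0, :] + q_r_km1) \
        - 0.5 * abs(a) * (1 - alpha) * (q_r_km1 - q[0, :])
    F[N, :] = 0.5 * a * (q[N, :] + q_l_kp1) \
        + 0.5 * abs(a) * (1 - alpha) * (q[N, :] - q_l_kp1)
    K = w[:, None] * D                # K_ij = w_i * l_j'(xi_i)
    return (a * (K.T @ q) - F) / (w[:, None] * J)


N, ne, a, alpha = 3, 200, 2500.0, 0.0            # alpha = 0: upwind flux
xmax, sigma, x0, eps, nt = 10000.0, 300.0, 1000.0, 0.08, 2000
xi, w, D = gll_operators(N)
h = xmax / ne
J = h / 2.0
x = (np.arange(ne)[None, :] + 0.5) * h + J * xi[:, None]   # node coordinates
dt = eps * J * (xi[1] - xi[0]) / abs(a)          # Courant criterion


def exact(t):
    """Advected Gaussian on the periodic domain."""
    d = (x - x0 - a * t + xmax / 2) % xmax - xmax / 2
    return np.exp(-(d / sigma) ** 2)


q = exact(0.0)
for it in range(nt):                  # predictor-corrector (Heun) time stepping
    k1 = rhs(q, a, alpha, w, D, J)
    k2 = rhs(q + dt * k1, a, alpha, w, D, J)
    q = q + 0.5 * dt * (k1 + k2)

err = np.abs(q - exact(nt * dt)).max()
print("dt =", dt, " max error =", err)
```

Replacing the two-stage update by a single Euler step `q = q + dt * k1` allows the instability described in the text to be explored with the same script.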

Let us take a concrete example and compare the discontinuous Galerkin approach
to other methods. The simulation parameters for our example are given in
Table 9.1. Note that, to keep it simple, we initiate the simulation with a
spatial initial condition using a Gaussian function $e^{-1/\sigma^2 (x-x_0)^2}$.
A source term can be added to the system of equations in a straightforward way
(see also the spectral-element method).

Fig. 9.10 Comparison of numerical schemes for the advection equation. The
parameters of the simulations are given in Table 9.1 and in the text. The title
of each plot indicates the method and the relative misfit energy. Snapshots of
the advected Gaussian waveform are superimposed for seven different time
steps (solid lines). Propagation is from left to right. Initial condition in
bold. The analytical solution is superimposed at each time frame (dashed line).
See text for details.

Table 9.1 Simulation parameters for 1D discontinuous Galerkin advection

Parameter   Value            Meaning
ne          200              elements
N           3                order
a           2,500 m/s        velocity
x_max       10,000 m         x-domain
dx_min      13.82 m          increment
dt          4.4 × 10^-4 s    time step
eps         0.08             Courant
σ           300 m            Gauss width
x_0         1,000 m          source x
T_max       3.5 s            duration

The results of the simulations are illustrated in Fig. 9.10. In this figure we
compare numerical solutions for the same linear advection problem using four
different algorithmic implementations: (1) the simplest upwind finite-volume
scheme (equivalent to the finite-difference method); (2) the Lax-Wendroff
finite-volume scheme (accurate to second order); and the discontinuous Galerkin
scheme using (3) the Euler method and (4) the predictor-corrector scheme.
The advecting field is superimposed at seven different times, equal for all four 
different methods. This presentation allows us to visually appreciate the stability 
(or lack thereof) of the original waveform given at the left of the spatial domain in 
bold. Remember that the analytical solution of the advection problem predicts that 
the initial waveform should be advected without any change in shape at all. This
clearly does not seem to be the case in the topmost example, where the upwind
finite-volume approach is shown. This numerical solution is characterized by 
strong numerical diffusion (see Chapter 8 on finite volumes) and is thus of little 
use for high-accuracy calculations. 
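
The numerical diffusion of the first-order upwind scheme is easy to reproduce. The sketch below is an illustration under assumptions, not the code behind Fig. 9.10: it uses a regular grid with 200 points, periodic boundaries, and a Courant number of 0.5 (all chosen here for convenience), and advects the Gaussian of Table 9.1 over 5,000 m.

```python
import numpy as np

a, xmax, nx = 2500.0, 10000.0, 200     # velocity, domain size, grid points
dx = xmax / nx
x = np.arange(nx) * dx
sigma, x0 = 300.0, 1000.0              # Gaussian width and initial position
courant = 0.5                          # assumed Courant number a*dt/dx
dt = courant * dx / a
nt = 200                               # 200 steps move the pulse by 5,000 m

q0 = np.exp(-((x - x0) / sigma) ** 2)
q = q0.copy()
for it in range(nt):
    # First-order upwind update for a > 0 with periodic boundaries
    q = q - courant * (q - np.roll(q, 1))

print("initial peak:", q0.max(), " final peak:", q.max())
print("peak position:", x[np.argmax(q)], "m")
```

The pulse arrives at the expected position and its integral is conserved, but the peak amplitude drops to roughly half of its initial value, which is the diffusive behaviour visible in the topmost panel of Fig. 9.10.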

A much improved finite-volume solution is obtained by the Lax—Wendroff ap- 
proach. However, if one looks carefully, an increasing time shift of the numerical 
solution compared to the analytical solution can be seen. The Euler-based discon- 
tinuous Galerkin scheme shows unstable behaviour in the wake of the advected 
waveform. Keeping the Courant criterion the same and implementing the high- 
order predictor—corrector method leads to a much improved solution, easily the 
most accurate of the four presented schemes. 

At this point it is instructive to use this simple discontinuous Galerkin scheme 
to exploit the options concerning the p- and h-adaptivity (see exercises). We aim 
at providing a solution to the elastic wave equation in 1D. We have shown in the 
chapter on the finite volume method that the coupled velocity—stress system of 
equations is formally equivalent to the advection problem. In the next sections we 
will develop the solution for this case. 


9.4 Elastic waves in 1D 


We recall the source-free elastic wave equation in 1D with unknown velocities
v(x, t) and stresses σ(x, t), and the elastic model defined with shear modulus
μ(x) and density ρ(x). With implicit space-time dependencies the coupled system
reads

$$\begin{aligned}
\partial_t \sigma &= \mu\, \partial_x v \\
\rho\, \partial_t v &= \partial_x \sigma,
\end{aligned} \tag{9.36}$$

which can be written in matrix-vector form as

$$\partial_t Q + A\, \partial_x Q = 0, \tag{9.37}$$

where $Q(t) = (\sigma(t), v(t))^T$ is the vector of unknowns and A(x) contains
the coefficients defined as

$$A = \begin{pmatrix} 0 & -\mu(x) \\ -\dfrac{1}{\rho(x)} & 0 \end{pmatrix}. \tag{9.38}$$

Vectors and matrices are given in bold letters.

We follow the same procedure as above, developing the weak form of this
equation assuming an arbitrary test function $\phi_j$. The system of equations
is multiplied with the test function $\phi_j$ and integrated over the physical
domain $D_k$, representing a 1D element (and not the entire physical domain as
in the finite-element derivation):

$$\int_{D_k} \partial_t Q(x,t)\, \phi_j(x)\, dx + \int_{D_k} A\, \partial_x Q(x,t)\, \phi_j(x)\, dx = 0. \tag{9.39}$$

8 Because of the interpolation properties of the Lagrange polynomials the
coefficients correspond directly to the values at the GLL points $\xi_i$.

Integration by parts of the second term, omitting most space-time dependencies,
leads to

$$\int_{D_k} \partial_t Q\, \phi_j(x)\, dx - \int_{D_k} A\, Q\, \partial_x \phi_j(x)\, dx + \int_{\partial D_k} A\, Q\, \phi_j(x)\, n\, dx = 0, \tag{9.40}$$

where n denotes a normal vector at the element boundaries. This continuous
notation is still quite general. For multi-faceted element shapes the boundary
integral over $\partial D_k$ has to be evaluated separately taking into account
local normal vectors. The first two terms are well known from any Galerkin-type
methods we have already encountered. In our 1D system the right term in Eq. 9.40
is an antiderivative and corresponds to the evaluation of AQ at both element
boundaries. These are the flux terms that will be detailed in the next section.

To develop the full discrete scheme we have to introduce the basis functions
for our solution field inside the elements. We follow the same approach as in the
scalar advection case (and the spectral-element method) and introduce a nodal
representation for the two-element solution vector Q(x, t). We replace the
unknown continuous field Q(x, t) by a sum over the same nodal basis functions as
we employed as test functions. Again we use Lagrange polynomials of order N.
Following notation of earlier chapters this leads to $N_p$ collocation points
(GLL) for ξ ∈ [-1, 1]. With Nth order Lagrange polynomials $\ell_i(\xi)$ we
obtain

$$Q(\xi, t) = \sum_{i=1}^{N_p} Q(\xi_i, t)\, \ell_i(\xi). \tag{9.41}$$

Following the convention of earlier chapters we use the same letter Q for the
coefficients of the polynomials.⁸

Note that despite the fact that we are solving a 1D physical problem Q is
a 3D matrix with size [ne, $N_p$, 2] where ne is the number of elements, and
$N_p$ the number of GLL collocation points inside an element. The third
dimension corresponds to stress and velocity values at the collocation points.
For example, element k has solution values

$$Q_k(\xi_i, t) = \begin{pmatrix} \sigma(\xi_1, t) & \sigma(\xi_2, t) & \dots & \sigma(\xi_{N_p}, t) \\ v(\xi_1, t) & v(\xi_2, t) & \dots & v(\xi_{N_p}, t) \end{pmatrix} \tag{9.42}$$

at the discrete time level t.

We are ready to assemble our full system of equations by inserting the discrete
approximation of Q into the weak form of the wave equation. Defining a matrix
Flux with the same shape as our solution field Q as

$$\mathrm{Flux} = \int_{\partial D_k} A\, Q\, \ell_j(\xi)\, n\, d\xi, \tag{9.43}$$

which we will detail later, we re-write Eq. 9.40 as

$$\partial_t Q(\xi_i, t) \int_{D_k} \ell_i(\xi)\, \ell_j(\xi)\, J\, d\xi - A\, Q(\xi_i, t) \int_{D_k} \ell_i(\xi)\, \partial_\xi \ell_j(\xi)\, d\xi = -\mathrm{Flux}, \tag{9.44}$$

where J is the Jacobian J = dx/dξ. We recognize the well-known mass matrix M
and stiffness matrix K

$$\begin{aligned}
M &= \int_{D_k} \ell_i(\xi)\, \ell_j(\xi)\, J\, d\xi \\
K &= \int_{D_k} \ell_i(\xi)\, \partial_\xi \ell_j(\xi)\, d\xi,
\end{aligned} \tag{9.45}$$

which are calculated in the same way as in the scalar case. Finally, the
semi-discrete scheme can be written in matrix-vector form as

$$M\, \partial_t Q = A K Q - \mathrm{Flux}. \tag{9.46}$$

By applying a standard first-order finite-difference approximation to the time
derivative (Euler scheme) we obtain

$$Q^{n+1} = Q^{n} + dt\, M^{-1} \left( A K Q - \mathrm{Flux} \right), \tag{9.47}$$

which is the scheme that we will implement in our code. The predictor-corrector
scheme can be implemented by analogy with the scalar case (see supplementary
electronic material).


9.4.1 Fluxes in the elastic case 


We have not completely finished yet; we must draw one more time on results from
the finite-volume method. To make notation light, let us consider the situation for
one element (avoiding subscripts k everywhere). We assume the coefficients of
matrix A to be constant inside the element. To recall,⁹ A is defined as

$$A = \begin{pmatrix} 0 & -\mu \\ -\dfrac{1}{\rho} & 0 \end{pmatrix} \tag{9.48}$$

and it can be diagonalized by

$$A = R \Lambda R^{-1}, \tag{9.49}$$

where

$$R = \begin{pmatrix} Z & -Z \\ 1 & 1 \end{pmatrix}, \tag{9.50}$$

with impedance $Z = \rho c = \rho \sqrt{\mu/\rho}$, $(Z, 1)^T$ and
$(-Z, 1)^T$ representing the eigenvectors.

9 [...] in much more detail in the previous chapter on the finite-volume method,
elastic case.

The diagonal matrix Λ

$$\Lambda = \begin{pmatrix} -c & 0 \\ 0 & c \end{pmatrix} \tag{9.51}$$

contains the eigenvalues of matrix A with $\mp c = \mp\sqrt{\mu/\rho}$. We
define a matrix |A| such that

$$|A| = R\, |\Lambda|\, R^{-1} = \begin{pmatrix} c & 0 \\ 0 & c \end{pmatrix} = \begin{pmatrix} \sqrt{\mu/\rho} & 0 \\ 0 & \sqrt{\mu/\rho} \end{pmatrix} \tag{9.52}$$

and separate A into positive (right-) and negative (left-propagating) eigenvalues
to obtain the definition of $A^{\pm}$:

$$\begin{aligned}
A^{-} &= R \Lambda^{-} R^{-1} = \frac{1}{2} \begin{pmatrix} -c & -cZ \\ -\dfrac{c}{Z} & -c \end{pmatrix} \\
A^{+} &= R \Lambda^{+} R^{-1} = \frac{1}{2} \begin{pmatrix} c & -cZ \\ -\dfrac{c}{Z} & c \end{pmatrix}.
\end{aligned} \tag{9.53}$$

The $A^{\pm}$ take the meaning of advection velocities in the scalar case
previously described. With

$$A = A^{+} + A^{-} \tag{9.54}$$

we arrive at the definitions that are commonly used in discontinuous Galerkin and
finite-volume flux formulations:

$$\begin{aligned}
A^{+} &= \tfrac{1}{2} \left( A + |A| \right) \\
A^{-} &= \tfrac{1}{2} \left( A - |A| \right).
\end{aligned} \tag{9.55}$$
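
The decomposition of Eqs 9.48 to 9.55 can be verified numerically. The snippet below is a minimal sketch with illustrative material values in arbitrary consistent units (ρ = 2, μ = 8, so that c = 2 and Z = 4); these numbers are assumptions chosen for readability and are not taken from the simulation tables.

```python
import numpy as np

rho, mu = 2.0, 8.0                 # illustrative density and shear modulus
c = np.sqrt(mu / rho)              # velocity, here 2.0
Z = rho * c                        # impedance, here 4.0

A = np.array([[0.0, -mu], [-1.0 / rho, 0.0]])       # Eq. 9.48
R = np.array([[Z, -Z], [1.0, 1.0]])                 # eigenvectors, Eq. 9.50
Rinv = np.linalg.inv(R)
Lam = np.diag([-c, c])                              # eigenvalues, Eq. 9.51

A_plus = R @ np.diag([0.0, c]) @ Rinv               # right-propagating part
A_minus = R @ np.diag([-c, 0.0]) @ Rinv             # left-propagating part
A_abs = R @ np.abs(Lam) @ Rinv                      # |A|, Eq. 9.52

print(A_plus)
print(A_minus)
```

With these values the two parts sum to A, their difference is |A| = c I, and both agree with the closed forms of Eq. 9.53.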


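How does the flux scheme work for element k? The term we previously called
Flux (Eq. 9.43) leads to four flux contributions for left and right sides of the
elements by analogy with the scalar case and the finite-volume elastic case, and
we obtain

$$\begin{aligned}
\int_{\partial D_k} A\, Q\, \ell_j(\xi)\, n\, d\xi = \;
& -A_k^{-} Q_l^{k} \int_{\partial D_k}^{l} \ell_i(\xi)\, \ell_j(\xi)\, d\xi \\
& + A_k^{+} Q_r^{k} \int_{\partial D_k}^{r} \ell_i(\xi)\, \ell_j(\xi)\, d\xi \\
& - A_k^{+} Q_r^{k-1} \int_{\partial D_k}^{l} \ell_i(\xi)\, \ell_j(\xi)\, d\xi \\
& + A_k^{-} Q_l^{k+1} \int_{\partial D_k}^{r} \ell_i(\xi)\, \ell_j(\xi)\, d\xi,
\end{aligned} \tag{9.56}$$

where the integral superscripts denote the point (in general boundary) at which
the integral has to be evaluated. The latter four integrals in the system of
equations correspond to matrices $F^{l,r}$ defined as

$$\begin{aligned}
F^{l} &= \int_{\partial D_k}^{l} \ell_i(\xi)\, \ell_j(\xi)\, d\xi \\
F^{r} &= \int_{\partial D_k}^{r} \ell_i(\xi)\, \ell_j(\xi)\, d\xi.
\end{aligned} \tag{9.57}$$

In the nodal case $F^{l,r}$ are of shape $N_p \times N_p$ and single out the
points at which the fluxes are evaluated (boundaries). Due to the definition of
Lagrange polynomials, $\ell_1(\xi_1) = \ell_1(-1) = 1$ and
$\ell_{N_p}(\xi_{N_p}) = \ell_{N_p}(1) = 1$. Thus we obtain

$$F^{l} = \begin{pmatrix} 1 & 0 & \dots & 0 \\ 0 & 0 & \dots & 0 \\ \vdots & & & \vdots \\ 0 & 0 & \dots & 0 \end{pmatrix}, \qquad
F^{r} = \begin{pmatrix} 0 & 0 & \dots & 0 \\ 0 & 0 & \dots & 0 \\ \vdots & & & \vdots \\ 0 & 0 & \dots & 1 \end{pmatrix}. \tag{9.58}$$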

With these definitions we can specify the Flux matrix that will be calculated in
the final algorithm at each time step. We obtain

$$\mathrm{Flux} = -A_k^{-} Q_l^{k} F^{l} + A_k^{+} Q_r^{k} F^{r} - A_k^{+} Q_r^{k-1} F^{l} + A_k^{-} Q_l^{k+1} F^{r}, \tag{9.59}$$

where we indicate with subscript k that the coefficient matrix A may vary for each
element, allowing heterogeneous media to be simulated.

We can further illustrate the meaning of the indices l, r in the flux formulation
of Eq. 9.59 in the context of an upwind flux (see Fig. 9.11). When we are at the
left boundary of an element, the upwind scheme implies that for a
right-propagating wave $A^{+}$ we have to consider the element to the left
$Q_r^{k-1}$ (inflow), and for a left-propagating wave $A^{-}$ we use the value
inside the current element $Q_l^{k}$ (outflow), and accordingly for the right
boundary.

In the code implementation, the flux matrix has the same size as the solution
field Q with ne × $N_p$ × 2 elements, where ne is the number of elements, $N_p$
the number of collocation points for a scheme of order N, and the last dimension
contains the values of stress and velocity at each collocation point.

In a pseudo-code form the flux matrix for element k, calculated at each time
step, looks like

$$\begin{aligned}
\mathrm{Flux}(k, 1, :) &= -A_k^{-} Q_l^{k} - A_k^{+} Q_r^{k-1} \\
\mathrm{Flux}(k, j, :) &= 0, \quad j = 2, \dots, N_p - 1 \\
\mathrm{Flux}(k, N_p, :) &= A_k^{+} Q_r^{k} + A_k^{-} Q_l^{k+1},
\end{aligned} \tag{9.60}$$

with implicit matrix-vector multiplication. We will further illustrate this below in
the simulation example.
the simulation example. 

Fig. 9.11 Flux scheme in the elastic case. For two possible propagation
directions fluxes have to be calculated at each element boundary for the two
solution fields σ and v. That makes 8 flux terms in total in the 1D case for
each element.


9.4.2 Simulation examples


Finally, let us show how the algorithm works in practice. The beauty is that the 
discontinuous Galerkin method can basically be assembled using modules from 
the spectral-element method (Lagrange polynomials, their derivatives, numerical 
integration) and the finite-volume method (flux scheme). 

The algorithm for the extrapolation of the solution matrix Q (Euler scheme) is

$$Q^{t+1} = Q^{t} + dt\, M^{-1} \left( A K Q - \mathrm{Flux} \right), \tag{9.61}$$


with the flux terms given in the previous section. In the following we show several 
code snippets in Python that illustrate the code structure. The initialization of 
mass and stiffness matrices is omitted as these are identical to those used in the 
spectral-element method. Please refer to the supplementary material for details. 

The solution field Q and the relevant coefficient matrices Ap and Am for 
positive and negative directions respectively, are initialized with the following 
shapes: 


# [...] 

# Initialize solution vectors
Q = zeros((ne, N+1, 2)) 
Qnew = zeros((ne, N+1, 2)) 

# initialize heterogeneous A 


Ap = zeros((ne,2,2)) 
Am = zeros((ne,2,2)) 
R = zeros((ne,2,2)) 
L = zeros((ne,2,2)) 


where ne is the number of elements and N the order of the solution scheme. Once
the element-dependent impedances $Z_i = \rho_i c_i$ are initialized the system
matrix A can be decomposed into positive and negative parts as coded below:


# Initialize flux matrices
for i in range(1, ne-1):
    # Z[i] = rho[i]*sqrt(mu/rho[i])
    # Left side positive direction
    R = array([[Z[i], -Z[i]], [1, 1]])
    L = array([[0, 0], [0, c[i]]])
    Ap[i,:,:] = R @ L @ linalg.inv(R)
    # Right side negative direction
    R = array([[Z[i], -Z[i]], [1, 1]])
    L = array([[-c[i], 0], [0, 0]])
    Am[i,:,:] = R @ L @ linalg.inv(R)


The matrices Am and Ap enter the function flux that returns the overall fluxes
with the same shape as the solution field Q. This routine is called at each time
step. Further variables that are passed are the current solution matrix Q, the
number of elements ne and the order of the scheme N.


def flux(Q, N, ne, Ap, Am):
    # [...]
    # for every element we have 2 faces
    # to other elements (left and right)
    out = np.zeros((ne, N+1, 2))
    # Calculate fluxes inside domain
    for i in range(1, ne-1):
        # Left boundary of element i
        out[i,0,:] = Ap[i,:,:] @ (-Q[i-1,N,:]) + Am[i,:,:] @ (-Q[i,0,:])
        # Right boundary of element i
        out[i,N,:] = Ap[i,:,:] @ Q[i,N,:] + Am[i,:,:] @ Q[i+1,0,:]
    # Boundaries
    # [...]
    return out


With this function definition, precalculated mass matrix M (and its inverse
Minv), and stiffness matrix K, we can extrapolate the system for a given initial
condition (we employ a Gaussian stress field Q at the first time step) using


# [...] 
# Euler extrapolation scheme 
for it in range(nt): 
# Calculate Fluxes 
# Extrapolate each element using flux F 
Flux = flux(Q, N, ne, Ap, Am) 
# Loop through all elements 
for i in range(1,ne-1): 
Qnew[i,:,0] = dt*Minv @ \ 
( -mu[i]*K@Q[i,:,1].T-Flux[i,:,0].T)+Q[i,:,0].T 
Qnew[i,:,1] = dt*Minv @ \ 
(-1/rho [i] *K@Q[i,:,0].T-Flux[i,:,1].T)+Q[i,:,1].T 


where the stresses Qnew[i,:,0] and velocities Qnew[i,:,1] are extrapolated
separately. As was indicated in the scalar case the Euler implementation is
useless from a practical point of view because of its dispersive properties.
Therefore, we implement a predictor-corrector scheme as in the scalar case (see
supplementary electronic material). The parameters for a homogeneous simulation
are given in Table 9.2.

The results are shown in Fig. 9.12. The initial Gaussian-shaped stress
distribution leads to stress and velocity waves propagating in both directions
away from the source. Visually there is no difference between the analytical
solution and the numerical solution. Careful analysis of this scheme and a
comparison with the other methods encountered so far is left as an exercise.
From an algorithmic point of view it is important to note that for Lagrange
polynomials of order N = 0 (constant function) we recover the classic
finite-volume method as introduced in the previous chapter.

Fig. 9.12 1D Elastic Case. Top: Stress waves propagating to the left and right
from the initial distribution (dotted line). Bottom: Velocities propagating with
reversed polarity in opposite directions. In both graphs the analytical solution
is superimposed. The difference is shown as a dashed line.

Table 9.2 Simulation parameters for 1D elastic case.

Parameter   Value            Meaning
ne          200              elements
N           4                order
c           2,500 m/s        velocity
ρ           2,500 kg/m^3     density
x_max       10,000 m         x-domain
dx_min      8 m              increment
dt          7 × 10^-4 s      time step
eps         0.2              Courant
σ           200 m            Gauss width
x_0         5,000 m          source x

Finally, we present a simulation of the discontinuous Galerkin method for
a heterogeneous model and compare it with the 1D velocity-stress
staggered-grid solution introduced in Chapter 4 on the finite-difference method.
We use the same simulation set-up as presented in Table 9.2 except that the den- 
sity p is decreased by a factor of 4 in the right half of the model (x > 5,000 m). 
This leads to a velocity increase by a factor of 2. The stress initial condition 
is located at x = 4,000 m. The results are shown in Fig. 9.13. There is an 
excellent fit between the two numerical solutions (finite-difference method vs. 
discontinuous Galerkin method). Both methods capture well the behaviour at the 
discontinuity. 
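
The heterogeneous model can be set up with a few lines. The sketch below is an illustration under the assumption, implied by the text, that the shear modulus is kept constant while the density drops by a factor of 4 for x > 5,000 m; the array names `rho`, `c` and `Z` mirror those used in the code snippets above, while `x_c` (element mid-points) is a hypothetical helper.

```python
import numpy as np

ne, xmax = 200, 10000.0
x_c = (np.arange(ne) + 0.5) * xmax / ne      # element mid-points
rho0, c0 = 2500.0, 2500.0                    # values of Table 9.2
mu = rho0 * c0**2                            # shear modulus, kept constant

rho = np.where(x_c > 5000.0, rho0 / 4.0, rho0)   # density per element
c = np.sqrt(mu / rho)                            # velocity per element
Z = rho * c                                      # impedance per element

print("velocity ratio right/left: ", c[-1] / c[0])
print("impedance ratio right/left:", Z[-1] / Z[0])
```

The velocity doubles across the interface while the impedance is halved; the impedance contrast is what produces the reflected and transmitted waves seen in Fig. 9.13.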

In the introduction to this chapter we motivated the method by pointing out its 
flexibility with respect to complex geometries. Of course this cannot be exploited 
in 1D. However, interesting aspects to investigate are: (1) allowing A to vary 
at each collocation point (possible also with the spectral-element method), and 
(2) varying the polynomial order (i.e. p-adaptivity) in each element (not advisable 
for the spectral-element method in higher dimensions). These options can be 
explored with the supplementary electronic material. 


9.5 The road to 3D 


The key challenge for 2D or 3D implementations of the discontinuous Galerkin
method is the fact that, in practice, you probably want to deal with
unstructured meshes. For regular meshes with planar element faces the
implementation is straightforward. For triangular, tetrahedral, or curved
hexahedral meshes the difficulty lies in the implementation of the flux
integrals over all boundaries.

It is important to note that there are two different implementation schemes.
The nodal approach is what we have used in this demonstration, following the
concepts of spectral elements. In this case the solution field is evaluated at the
collocation points. An alternative is the modal approach in which basis functions
are used that do not have these interpolation properties.

The modal approach was used in the 2D isotropic case by Käser and Dumbser
(2006), and extended to the viscoelastic (Käser et al., 2007a) and anisotropic
(de la Puente et al., 2007) cases. Further articles with algorithms in 2D and 3D
following this approach are Dumbser et al. (2007a) and Dumbser et al. (2007b).
The latter paper introduced local time-stepping. In that case in principle each
element can have its own time step. While a difficult task for load-balancing,
this is an attractive feature to reduce overall computational costs. Further
analysis of the tetrahedral, modal implementation in 3D was given by Pelties
et al. (2012).

Etienne et al. (2010) introduced a 3D nodal scheme using Lagrange ba- 
sis functions on tetrahedral grids. Mazzieri et al. (2013) published a 3D 
nodal implementation. Both these algorithms were used to simulate earthquake 
scenarios. 

Because of its intrinsic local character, the natural allowance for arbitrary 
high-order schemes, and the possibility to vary order in space, there has been 
an increasing interest in the discontinuous Galerkin method for wave and rup- 
ture problems. This is partly due to the fact that in order to be able to use the 
rapidly increasing number of processors in supercomputers, scaling is a key is- 
sue. Further applications of the discontinuous Galerkin method and pointers to 
community codes are given in Part III and the Appendix. 

Fig. 9.13 Simulation in a heterogeneous medium, comparison with the
finite-difference method (bottom row). A Gaussian-shaped stress initial
condition (dotted line) is propagating in both directions. In the middle of the
domain the velocity increases by a factor of 2 leading to both transmission and
reflection. Top left: Discontinuous Galerkin simulation with order N = 4 (solid
line), initial condition (dotted line) and analytical solution for homogeneous
case (dashed line). Bottom left: Same as top left but with finite-difference
method. Top right: Detail around interface, discontinuous Galerkin method.
Bottom right: Same with finite-difference method.


Chapter summary


The discontinuous Galerkin method is a finite-element-type method. The 
main difference to standard finite-element methods is that the solution 
fields are not continuous at the element boundaries. 


The elemental mass and stiffness matrices are formulated very similarly 
to how they are in the classic finite-element schemes. However, they are 
never assembled to a global system of equations. Therefore no large system 
matrix needs to be inverted. 


Elements are linked by a flux scheme, similar to the finite volume method. 
This scheme leads to an entirely local algorithm in the sense that all calcu- 
lations are carried out at an elemental level. Communication happens only 
to direct neighbours. 


The discontinuous Galerkin scheme can easily be extended to higher or- 
ders, keeping the local nature of the solution scheme. This leads to high 
efficiency when parallelizing. 


The solution fields can be expanded using nodal and modal approaches. 


The discontinuous behaviour at the element boundaries and the associated 
discretization of the element boundaries increases the number of degrees 
of freedom compared to other methods. 


The flexibility with polynomial order, element size, and local time step- 
ping leads to a formidable problem when parallelizing a discontinuous 
Galerkin method. Solution to this problem requires close cooperation with 
computational scientists. 


In seismology, the discontinuous Galerkin method is useful for prob- 
lems with highly complex geometries (by using tetrahedral meshes) and 
for problems with non-linear internal boundary conditions (e.g. dynamic 
rupture problems). 


FURTHER READING 


• The book by Hesthaven and Warburton (2008) provides an extensive discussion
  of the history of and motivation for the introduction of the discontinuous
  Galerkin method. The focus is on the nodal version.

• Leveque (2002) discusses in detail the concept of fluxes in connection
  with the finite-volume method. The same flux schemes are used in the
  discontinuous Galerkin method.


EXERCISES 


Comprehension questions 


(9.1) List the key points that led to the development of the discontinuous
Galerkin method in seismology. Discuss the pros and cons of the method compared
to finite-element-type methods and the finite-difference method.

(9.2) Explain qualitatively the difference between nodal and modal approaches.

(9.3) Explain why the discontinuous Galerkin method lends itself to parallel
implementation on supercomputer hardware.

(9.4) What are p- and h-adaptivity? Why is it straightforward to have this
adaptivity with the discontinuous Galerkin method and not with others? Give
examples in seismology where this adaptivity can be exploited and why.

(9.5) What is local time-stepping? For what classes of Earth models and/or
problems in seismology might it be useful?

(9.6) What is the problem that arises on computers when using algorithms with
h-/p-adaptivity and local time-stepping?

(9.7) Compare the spectral-element and the discontinuous Galerkin methods as
described in this volume. Point out their strong similarities and their
differences. Based on this discussion, formulate domains of application.


Theoretical problems 


(9.8) Show that the advection problem $\partial_t q + a\, \partial_x q = 0$ has
a hyperbolic form.

(9.9) The coupled 1D wave equation for longitudinal velocity v and pressure p
can be formulated with compressibility K and density ρ as

$$\begin{aligned}
\partial_t p + K\, \partial_x v &= 0 \\
\partial_t v + \frac{1}{\rho}\, \partial_x p &= 0.
\end{aligned} \tag{9.62}$$

Formulate the coefficient matrix A of the coupled system of equations.
Calculate its eigenvalues and eigenvectors. Compare with the solutions
developed in this chapter for transversely polarized waves.

(9.10) Show that the rule of integration by parts corresponds to Gauss's theorem
in higher dimensions (assuming one of the functions under the integral
to be unity). Explain the relevance of this for the discontinuous Galerkin
method.

(9.11) Show that setting α = 0 in Eq. 9.32 leads to the upwind flux scheme.

(9.12) Discuss the size of all matrices and vectors for the 1D solution
presented in Eq. 9.47.

(9.13) Search in the literature for the classical four-term Runge-Kutta method.
Formulate a pseudo-code for the scalar advection problem for this
extrapolation scheme.


Programming exercises 


(9.14) Apply the 1D discontinuous Galerkin solution for the scalar advection
problem and find numerically the stability limit for the Euler scheme and
the Lax-Wendroff scheme. Vary the polynomial order and investigate
whether the stability limit changes. Compare with the stability behaviour
of the finite-difference method for similar grid-point density.

(9.15) How discontinuous is the discontinuous Galerkin method? For the
example problems given in the supplementary material, extract the field
values at the element boundaries from the adjacent elements and calculate
the relative amount of the field discontinuity. How do the discontinuities
compare with the flux values?

(9.16) Formulate an upwind finite-difference scheme for the scalar advection
problem and write a computer program. Discuss the diffusive behaviour.
Compare with the results of the scalar discontinuous Galerkin implementation.

(9.17) Modify the sample code such that each element can have its own
polynomial order (p-adaptivity) and size (h-adaptivity). (Suggestion:
Initialize the size of the solution matrices using the maximum number of
degrees of freedom $N_{\max}$.)

(9.18) Extend the sample code for the scalar advection problem to the
four-term Runge-Kutta method. Compare the accuracy of the method with
the lower-order extrapolation schemes as a function of spatial order N
inside the elements.

(9.19) Formulate the analytical solution to the advection problem (see
Chapter 8 on the finite-volume method) and plot it along with the numerical
solution each time you visualize during extrapolation. Formulate an
error between analytical and numerical results. Analyse the solution error
as a function of propagation distance for the Euler scheme and the
predictor-corrector scheme.

(9.20) Explore the p- and h-adaptivity of the discontinuous Galerkin method in
the following way. Using an appropriate Gaussian function defined on the
entire physical domain, decrease the element size by a factor of 5 towards
the centre of the domain. Find an appropriate variation of the order inside
the elements to obtain a reasonable computational scheme (in the sense
that the grid-point distance does not vary too much). Hint: Use high-order
schemes at the edges of the physical domain and low(est)-order
schemes at the centre of the domain.


Part III


Applications 


Applications in Earth 
Sciences 


How can we make best use of the technologies described in this volume for real 
Earth science problems? Unfortunately, as indicated many times in the previous 
chapters, there is no single numerical method that works best in all situations. 
It is also important to note that the material we have covered here is really just 
scratching the surface. However, it should allow you to proceed with the next 
step, studying the literature which has solutions for problems in 3D, often with 
further improvements of the numerical schemes to make them more accurate. 

The goal of this chapter is (1) to provide some fundamental questions you 
should ask yourself before choosing a specific solver, and (2) to point to some 
milestone papers and recent examples where research questions were addressed 
with (2.5D or) 3D algorithms. It is important to note that the following sections 
are not reviews of the specific research fields. This is beyond the scope of this vol- 
ume. The stress is on recent applications of the methods discussed in the previous 
chapters, mostly in a 3D context. For more information on the developments in 
the research fields discussed, the reader is referred to the Further reading list at the 
end of Chapter 1. 

What are key issues for seismic wave-propagation simulation tasks? Certainly, 
the list below is non-exhaustive, but it should help you in getting started. 


• What is the geometry of your problem? Does it suffice to work with
  Cartesian coordinates? Do you have spherical or cylindrical geometry?

• Can your problem be reduced to a 2.5D problem (substantially reducing
  the required resources)?

• What is your desired wavefield frequency range in relation to the size of
  your model? What is roughly the expected memory requirement for your
  Earth model and wavefields?

• What is the source-receiver geometry of your problem? Can you make use
  of reciprocity (interchange sources and receivers) to reduce costs?

• What rheology do you need in order to solve your problem (e.g. elastic,
  anisotropic, viscoelastic, poroelastic)?

• Which wave types do you want to simulate (body waves, surface waves,
  complete wavefield)? How important is a highly accurate implementation
  of the free-surface boundary condition?

• What is the level of geometric complexity of your Earth model (surface
  topography, internal boundaries such as faults or interfaces)?

• What is the degree of heterogeneity in your geophysical parameters? How
  much do they change? Is the specific numerical method capable of coping
  with these parameter changes?

• How would you describe the properties of your Earth model (smooth,
  layered, etc.)? Do you need to honour internal interfaces?

• Can your problem be handled with regular (structured) meshes or do you
  require irregular (unstructured) meshes?

• Does the problem (or the solver available to you) require computational
  mesh generation? Are there meshes available for the specific region you
  want to model or do you have to generate them yourself?

• Does your problem require parallel resources? Is your program adapted
  to these resources in terms of software (e.g. MPI implementation) or
  hardware (e.g. CPU or GPU, or both)?

• What is your strategy to check the accuracy of your results?

• What are the data volumes you will create? Can you (or your processing
  tools) handle these data volumes (data transfer, available disk space,
  available core memory)? Alternatively, will you have to consider
  co-processing (i.e. analysing or visualizing results during runtime)?

• Does your problem require specific boundary conditions to be applied
  (e.g. absorbing boundaries, free-surface boundary)?

• Are you targeting dynamic rupture problems (special internal frictional
  boundaries apply to pre-defined fault surfaces)?
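The question about frequency range and memory in the list above can be turned into a quick back-of-the-envelope calculation. The following Python sketch is only an illustration for a regular 3D grid. The function name `estimate_grid`, the rule of thumb of 10 points per shortest wavelength, the count of 12 stored values per grid point (9 velocity-stress fields plus 3 material parameters), and single precision are all assumptions made here; they would need to be adapted to the solver actually used.

```python
def estimate_grid(domain_km, v_min_km_s, f_max_hz,
                  points_per_wavelength=10, values_per_point=12,
                  bytes_per_value=4):
    """Rough grid size and memory for a regular 3D grid (illustrative only).

    domain_km            : (Lx, Ly, Lz) extent of the model in km
    v_min_km_s           : lowest wave speed in the model
    f_max_hz             : highest frequency to be resolved
    points_per_wavelength: assumed sampling rule of thumb
    values_per_point     : assumed number of stored fields and parameters
    """
    lambda_min = v_min_km_s / f_max_hz            # shortest wavelength (km)
    dx = lambda_min / points_per_wavelength       # required grid spacing (km)
    n_axis = [int(round(length / dx)) + 1 for length in domain_km]
    n_points = n_axis[0] * n_axis[1] * n_axis[2]
    memory_gb = n_points * values_per_point * bytes_per_value / 1e9
    return dx, n_axis, n_points, memory_gb


# Hypothetical example: 100 x 100 x 50 km model, slowest velocity 1.5 km/s,
# frequencies up to 1 Hz.
dx, n_axis, n_points, mem_gb = estimate_grid((100.0, 100.0, 50.0), 1.5, 1.0)
print(f"dx = {dx:.3f} km, grid = {n_axis}, memory = {mem_gb:.1f} GB")
```

Because the number of grid points scales with the cube of the maximum frequency (and the number of time steps adds another power), doubling the frequency in this sketch raises the memory by a factor of eight.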


Once you have given (rough) answers to these questions, there might already be 
a tendency to favour one or the other numerical method or piece of community 
software available today. In the following sections we will show some examples 
from a variety of geo-scientific domains in which the numerical methods dis- 
cussed in this volume were used to solve scientific problems. The focus here is 
deliberately not on studies that present technical developments (such references 
are given in Part II) but on applications to science problems (in 2.5D or 3D) 
often involving comparison with observations. 

As routine (and sufficiently large numbers of) 3D simulations have only
become possible in the past few years and continue to require sometimes
substantial parallel resources, the number of relevant large-scale research
projects is still quite small; but it is increasing rapidly. The examples
given in what follows are by no means exhaustive and shall merely reflect the
thought processes that go into choosing appropriate solution strategies.¹


10.1 Geophysical exploration 


Several of the applications of numerical methods to the seismic wave-propagation 
problem were pioneered and developed within the exploration domain. This is not 
surprising. Exploring the subsurface for resources and monitoring their extrac- 
tion over time requires the analysis of seismic waves. Without doubt the more 
realistic the forward modelling, the better the images are. Also, no other field has 
so much control over source-receiver geometries, and the current 3D marine or 
terrestrial experiments produce breathtaking data sets. Therefore, it is also fair 
to say that many of the technical developments and extensions today are done 
in the research labs of large geophysical exploration companies.² Thus, many of
these developments are reported in the annual meetings of professional societies
such as the European Association of Geoscientists and Engineers (EAGE) and
the Society of Exploration Geophysicists (SEG).³

What is the character of exploration-type models? Let us list some of the most
important features:

• The physical domains are such that the spherical nature of the Earth does
  not need to be taken into account.

• You are simulating in limited domains. Thus, adequately performing
  absorbing boundaries are important.

• For marine simulation scenarios, free-surface boundary conditions are
  trivial; however, you have to make sure the fluid-solid boundaries (sea
  bottom) are reflecting and transmitting correctly.

• When comparing with observations, anisotropy and viscoelasticity have to
  be taken into account.

• Depending on the target frequency range, the geometry of strong
  discontinuities might have to be honoured.

• In most cases, body waves are the target seismic phases. In this case
  low-order implementations of the free-surface boundary condition might
  suffice.


Exploration geophysicists developed benchmarking projects for forward
modelling and inversion with which the emerging technologies could be tested.
A famous example is the Marmousi model, which was introduced in the late
eighties primarily to test migration schemes. The 2D model is shown in
Fig. 10.1. There are some interesting features that indicate the challenges of
modelling (and imaging) reservoir wave propagation.

First, note that the P-wave velocities range from 1.5 km/s to 5.5 km/s. For an 
elastic model with corresponding S-velocities this means that wave velocities vary 
by almost an order of magnitude. From a computational point of view this implies 
Fig. 10.1 Marmousi velocity model.
A typical exploration-type seismic velocity model used to benchmark solvers
and inverse problems. It contains many features that one finds in (marine)
sedimentary structures.


1 The authors of the projects I have 
missed out here may forgive me! 


2 Here is a message for students. If you
want a job in the exploration industry, 
do yourself a favour by studying and us- 
ing simulation technology. The exploration 
domain job profiles often refer to com- 
putational skills in particular. Having said 
that, many other applied branches out- 
side the geo-domain are making more and 
more use of simulation technologies. 


3 Note that the level of research in the
exploration field is well expressed by the 
expanded abstracts that are published with 
these meetings. It is not easy to find large- 
scale simulation problems with real data in 
the exploration domain published in jour- 
nals, though, as much of it is confidential. 
Fig. 10.2 Waves around boreholes. Unstructured tetrahedral mesh of a
cylindrical borehole including a sensor inside the borehole and the medium
around it. From Käser et al. (2010). Reprinted with permission.


that, while in general possible, the simulation of waves through such models with 
regular grids will necessarily lead to oversampling in parts of the model. Second, 
the model contains inclined layers of complex discontinuous shape. This raises 
the question of whether for such models meshes have to be generated that follow 
these interfaces and honour the shape of faults and discontinuities. As indicated in 
the introduction, this is the topic of ongoing research. Homogenization (see next 
chapter) would in principle allow replacing the discontinuous structures with a 
smooth version offering more flexibility concerning the mesh and the numerical 
method to be used. 
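The oversampling argument can be quantified with a few lines of Python. On a regular grid the spacing is dictated by the slowest wave speed, so regions with the fastest speed are sampled more densely than necessary by the ratio of the two speeds. In the sketch below the P-velocity range is taken from the text, whereas the S-velocity (a Poisson solid with $v_s = v_p/\sqrt{3}$), the 10 points per wavelength, and the function name `oversampling` are assumptions made purely for illustration.

```python
import math


def oversampling(v_min, v_max, points_per_wavelength=10.0):
    """Return the velocity ratio and the points per wavelength that the
    fastest region ends up with when the grid is designed for v_min."""
    ratio = v_max / v_min
    return ratio, points_per_wavelength * ratio


vp_min, vp_max = 1.5, 5.5               # km/s, range quoted for the model
vs_min = vp_min / math.sqrt(3.0)        # assumption: Poisson solid

ratio_acoustic, ppw_acoustic = oversampling(vp_min, vp_max)
ratio_elastic, ppw_elastic = oversampling(vs_min, vp_max)

print(f"acoustic: velocity ratio {ratio_acoustic:.2f}, "
      f"{ppw_acoustic:.0f} points per wavelength in the fastest region")
print(f"elastic:  velocity ratio {ratio_elastic:.2f}, "
      f"{ppw_elastic:.0f} points per wavelength in the fastest region")
```

Under these assumptions the fastest parts of an elastic model are sampled roughly six times more densely per dimension than required, which is the motivation for meshes that adapt to the velocity structure.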

Another interesting feature in the Marmousi model is the presence of the 
high-velocity inclusion (white horizontal feature at the bottom of Fig. 10.1 mim- 
icking a salt body intrusion). There is a drastic velocity jump at the boundaries. 
It is not obvious that the numerical solvers can handle these jumps correctly 
(something that would need to be tested before performing final simulations). 
In terms of imaging, such reflectors are so dominant that they hide the structure 
underneath. 

The efforts to test and verify numerical modelling and imaging methods 
are currently continued in the SEG-SEAM project (SEG Advanced Modeling 
Corporation) with substantially more sophisticated test models; see Fehler and 
Keliher (2011) for an account of the first phase, and subsequent reports in the 
Leading Edge journal. 

Recommendations as to which numerical method works best in this situation 
are difficult. The most common approaches are the finite-difference method, and 
the finite/spectral-element methods. As indicated above, regular grid methods 
would require very fine meshes to accurately account for model complexities. On 
the other hand, honouring interfaces involves the generation of meshes, a process 
that can be very time consuming. 

Certain classes of exploration-type models are extremely difficult to construct 
with regular and/or hexahedral grids. An example is shown in Fig. 10.2. To under- 
stand the waveforms recorded with high-frequency sources and sensors around 
boreholes, the cylindrical borehole and the sensor inside have to be meshed.
Tetrahedral meshes are the method of choice for such complex geometries. Käser
et al. (2010) applied the discontinuous Galerkin method (SeisSol) using tetrahedral
meshes to exploration problems. The overhead is the substantially higher com- 
putation time compared to regular meshes with similar mesh density (that would 
not be able to properly honour geometry). 

Recent examples of reservoir wave simulations using 3D finite-difference
methods are: a tutorial on 3D acoustic wave propagation with reservoir
applications by Etgen and O'Brien (2007); Regone (2007), simulating a
wide-angle survey for subsalt imaging; and Rusmanugroho and McMechan (2012),
modelling a vertical seismic profiling (VSP) experiment. Spectral-element
simulations were presented by Boxberg et al. (2015) for porous media with
application to CO2 monitoring, and Morency et al. (2011) for acoustic,
elastic, and poroelastic simulations of CO2 sequestration and crosswell
monitoring.


General accounts of numerical wave propagation for reservoir problems with 
a variety of methods can be found in the excellent paper collection by Robertsson 
et al. (2012). 


10.2 Regional wave propagation 


Regional (or continental-scale) wave propagation stands for problems with
O(1,000 km) dimensions. As nuclear tests have been banned,⁴ the source models
are medium to large earthquakes (or other types of seismic sources such as ocean
waves). Let us summarize the main requirements for regional wave propagation:


• For 3D Earth models with dimensions >1,000 km, the spherical (or
  elliptical) shape has to be taken into account.

• Seismic wavefields for these propagation distances are dominated by
  surface waves. Therefore, the accurate implementation of the free-surface
  boundary condition is important.

• Structures near the Earth's surface are characterized by low velocities (e.g.
  crustal velocities and/or ocean layers). This indicates the need to refine the
  computational meshes near the Earth's surface.

• For comparison with observations, the inclusion of viscoelastic and
  anisotropic effects is important.

• Wave propagation in spherical sections can be formulated using either
  Cartesian or spherical coordinates (details below).

• For regional wave-propagation problems the physical domain can often be
  limited to the crust and (upper) mantle.

• The limitation of regional models implies that efficient absorbing
  boundaries are important.


The first algorithms for 3D regional wave propagation were presented by Igel 
(1999) using the Chebyshev pseudospectral approach and a formulation of the 
wave equation in spherical coordinates. Later a high-order staggered-grid finite- 
difference approach was applied to the same mathematical formulation (Igel et al., 
2002), with an application to wave propagation in subduction zones. 

Note the problem of discretizing a (regular) mesh in spherical coordinates: 
For models that include one of the poles the mesh elements become very small 
and in addition the equations are not defined at the axis θ = 0. These
problems can be avoided by centring the mesh of a spherical section on the equator
(i.e. rotating/shifting regions appropriately). Then the application of numerical 
methods to wave propagation in spherical coordinates has some beauty: For 
Chebyshev pseudospectral or finite-difference methods the implementation of 
free-surface boundary conditions is straightforward due to the orthogonality of 
local coordinates at the Earth’s surface. 
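The effect of the poles on a regular spherical mesh is easy to check numerically. On a grid with constant angular increments the physical spacing in the longitudinal direction is $r \sin\theta \, \Delta\varphi$, so it shrinks towards the axis. The sketch below compares a 30° wide section that reaches almost to the pole with one of the same width centred on the equator. The angular increment of 0.1°, the Earth radius of 6371 km, and the function name `phi_spacing_km` are assumptions chosen for illustration.

```python
import numpy as np


def phi_spacing_km(theta_deg, dphi_deg, radius_km=6371.0):
    """Physical grid spacing in the phi direction at colatitude theta."""
    theta = np.radians(theta_deg)
    return radius_km * np.sin(theta) * np.radians(dphi_deg)


dphi = 0.1  # angular grid increment in degrees (assumed)

# Section starting one grid point away from the pole, 30 degrees wide.
polar = phi_spacing_km(np.arange(dphi, 30.0 + dphi / 2, dphi), dphi)

# Section of the same width centred on the equator (colatitude 75 to 105).
equatorial = phi_spacing_km(np.arange(75.0, 105.0 + dphi / 2, dphi), dphi)

ratio_polar = polar.max() / polar.min()
ratio_equatorial = equatorial.max() / equatorial.min()

print(f"polar section:      max/min spacing = {ratio_polar:.1f}")
print(f"equatorial section: max/min spacing = {ratio_equatorial:.3f}")
```

Since the time step of an explicit scheme is tied to the smallest grid spacing, the polar section in this sketch would need time steps that are smaller by more than two orders of magnitude, whereas the equatorial section is almost uniformly sampled.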
4 The former Soviet Union recorded so-called peaceful nuclear explosions
(PNEs) on seismic profiles that were as long as 3,000 km. Analyses of these
data were published in Fuchs (1997).




Fig. 10.3 Cubed-sphere chunk. Computational mesh for regional
spectral-element simulation based on the cubed-sphere approach. In this case
one chunk is extended into the mantle. The element size is adapted to the
increasing velocities at depth. (From specfem3d manual, Figure courtesy of
M. Chen.)


Fig. 10.4 Tetrahedral mesh parts assembled for regional wave propagation.
The mesh was used for simulations with the discontinuous Galerkin method. The
mesh density follows the seismic velocity model keeping the number of points
per wavelength approximately constant throughout the model. Figure courtesy of
S. Wenk.


An entirely different approach and a milestone for regional and global 
wave propagation was introduced through the cubed-sphere concept (Ronchi 
et al., 1996). Shortly after, the spectral-element method was adapted to wave- 
propagation problems such that the mass matrix was diagonal, allowing a fully 
explicit scheme, which is highly efficient for parallel implementations. These de- 
velopments paved the way for the combination of spectral elements with the 
cubed-sphere approach based on a formulation of the wave equation in Carte- 
sian coordinates and appropriate mapping of the elements to spherical geometry 
(Chaljub et al., 2003; Chaljub et al., 2007). For the problem of regional waves 
only one (or several) cubed-sphere chunk(s) are used (for an example see 
Fig. 10.3). 

To make up for the decreasing element size with depth for regular cubed- 
sphere meshes (while having at the same time increasing seismic velocities) the 
element size increases by an integer factor of two at several depth levels. This 
approach is implemented in the specfem3d software and plays an important 
role today for the application of full-waveform inversion on regional scales (see 
Section 10.5). 

At a time when full-waveform inversion using earthquake data on regional
scales was around the corner, Fichtner and Igel (2008) and Fichtner et al. (2009b)
presented an alternative, formulating a spectral-element method in spherical co- 
ordinates on a regular mesh. As this obviously leads to problems with accurately 
discretizing the Earth’s crust, a local homogenization scheme was introduced 
making sure that the surface waves are properly modelled. This leads to a 
very efficient scheme that was used for the first full-waveform inversion on a 
continental scale (Fichtner et al., 2009a) with an application to Australia. An- 
other spectral-element implementation specifically designed for regional wave 
propagation (regsem, see appendix) was presented by Cupillard et al. (2012). 

The need to adapt meshes to the low velocities near the Earth’s surface led 
to the application of the discontinuous Galerkin method to the problem of re- 
gional wave propagation (Wenk et al., 2013). An example is shown in Fig. 10.4. 
Mesh generation in this case is straightforward. However, because the implemen- 
tation is based on tetrahedra with planar faces (compared to curved hexahedra in 
the cubed-sphere approach), the convergence to the reference solutions is slow. 
In other words, extremely small tetrahedra have to be used and the high-order 
potential of the discontinuous Galerkin method (on large elements) cannot be 
exploited. A snapshot example is shown in Fig. 10.5. 

In summary, because of the strong requirement for high-order implemen- 
tation of the free-surface condition, the most widely used numerical method 
today for regional wave propagation is the spectral-element method with hexa- 
hedral meshes. Other alternatives that have not yet been fully explored include 
the discontinuous Galerkin method in nodal form on hexahedral grids. Given the 
current and future large-scale seismic array projects such as USArray or AlpAr- 
ray, efficient numerical tools for regional wave propagation will be important in 
the coming years. 


10.3 Global and planetary seismology 


Let us list the requirements for 3D global or planetary seismic wave propagation: 


• The shape of the body to be simulated is a complete sphere (or ellipsoid, or
  a deformed ellipsoid).

• Planets are naturally limited areas. Therefore we do not need absorbing
  boundaries (hurray!).

• The obvious way of describing physical problems on a sphere is by means
  of spherical coordinates. However, this leads to problems when applying
  numerical methods due to singularities.

• Seismic velocities inside the Earth span more than an order of magnitude.
  This indicates that the density of computational meshes should vary
  accordingly.

• Global (teleseismic) seismograms are (mostly) dominated by surface waves.
  Thus accurate implementation of the free-surface boundary condition is
  crucial.

• When comparing with observations, attenuation and anisotropy have to be
  taken into account.

• Unless you are targeting very long periods, waves are likely to propagate
  many wavelengths. Therefore the numerical scheme has to be extremely
  accurate.

• At least our planet has a substantial number of oceans. In general, their load
  (depth) and velocity structure have to be taken into account.

• Earth's crustal structure has a strong impact on seismic waveforms
  observed at the surface. Therefore, knowledge of this structure and proper
  implementation of it in a numerical scheme is crucial (still a hot topic
  today).


It is interesting to note that one of the first applications of the finite-difference 
method to wave-propagation problems was targeting global wave propagation. 
The pioneering work by Alterman et al. (1970) was way ahead of its time, but due 
to the limitations of computational resources the simulations were only possible 
at very long periods. Parallel computing was established in the nineties and nu- 
merical schemes were developed in 2D and 3D Cartesian frameworks primarily 
for exploration problems or earthquake seismology. An obvious further domain 
of application was global seismology. However, because of the tremendously long 
propagation distances (in terms of number of wavelengths) the actual value of 
computational global wave simulation was questionable. 

By that time global seismology was dominated by quasi-analytical methods, 
such as normal-mode solutions that are exact for spherically symmetric Earth 
Fig. 10.5 Regional wave propagation in Europe. Artistic view of seismic wave
propagation based on the discontinuous Galerkin method. The edges of the
tetrahedral grid are indicated by blue lines under the oceans. Figure courtesy
of M. Meschede.


Fig. 10.6 Axisymmetric modelling. Snapshot of global seismic wave
propagation using a spectral-element method for the visco-elastic wave
equation in cylindrical coordinates. Figure from van Driel et al. (2015a).
Fig. 10.7 Spectral-element mesh for axisymmetric case. Element size is adapted
to accommodate varying seismic velocities in the Earth's interior. Figure from
van Driel et al. (2015a).


Fig. 10.8 Cubed sphere. The cubed-sphere concept (Ronchi et al., 1996) allows
discretizing a spherical object with deformed cubes. This opened the way for
hexahedral-based spectral elements to be applied to global wave propagation.
From Tsuboi et al. (2003). Reprinted with permission.


5 At the time I remember strong scepticism with the argument that the Earth is
spherically symmetric to first order and the rest can be dealt with using
perturbation methods.


models and the incorporation of 3D effects using perturbation methods; see 
the excellent book by Dahlen and Tromp (1998). Following the early work by 
Alterman et al. (1970) an attractive computational scheme is the reduction of the 
complete spherical domain to a hemisphere, assuming that all fields are invariant 
along lines of constant latitude. This so-called axisymmetric (or zonal) approach 
corresponds to a 2.5D scheme as described in Section 3.3.1. 

First realizations of this approach using high-order staggered-grid
finite-difference approximations to the wave equation in spherical coordinates
were presented by Igel and Weber (1995), Igel and Weber (1996), and Chaljub and
Tarantola (1997).⁵ An example is illustrated in Fig. 10.6. It is important to note
that the wave equation in spherical coordinates is singular at the axis θ = 0 (just
check the Laplace operator in spherical coordinates) and the implementation of
general seismic sources is tricky.

However, while limited concerning direct comparison of synthetic seismo- 
grams with observations, axisymmetric methods proved useful to estimate effects 
of laterally heterogeneous structures in the mantle (Igel and Weber, 1996; Igel 
and Gudmundsson, 1997; Jahnke et al., 2008; Thorne et al., 2013a).

A major boost to this 2.5D approach came with the option to use arbitrary
seismic sources (Toyokuni and Takenaka, 2006) with the finite-difference
method or the spectral-element method on a mesh with depth-dependent
element size (Fig. 10.7). Nissen-Meyer et al. (2007) introduced a scheme by
which seismograms for arbitrary moment tensors could be obtained by
summation. This approach was recently extended, allowing extremely fast
calculation of high-frequency synthetic seismograms using the superposition
principle and pre-calculated Green's functions (van Driel et al., 2015b).
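The superposition principle behind such schemes is simply the linearity of the wave equation with respect to the source. The Python sketch below illustrates this in a generic way: a seismogram for an arbitrary moment tensor is formed as a weighted sum of pre-computed elementary seismograms, one per independent moment-tensor component. This is not the specific source decomposition of the papers cited above, and the elementary traces used here are made-up damped sinusoids rather than real Green's functions; the function name `synthesize` and the ordering of the six components are assumptions.

```python
import numpy as np


def synthesize(elementary, moment):
    """Weighted sum of elementary seismograms.

    elementary : array of shape (6, nt), one trace per independent
                 moment-tensor component (assumed to be pre-computed)
    moment     : the six moment-tensor components in the same order
    """
    elementary = np.asarray(elementary, dtype=float)
    moment = np.asarray(moment, dtype=float)
    return moment @ elementary          # shape (nt,)


# Made-up elementary traces standing in for pre-calculated solutions.
nt = 200
t = np.linspace(0.0, 10.0, nt)
elementary = np.array([np.sin(2 * np.pi * 0.1 * (k + 1) * t) * np.exp(-0.3 * t)
                       for k in range(6)])

m1 = np.array([1.0, -1.0, 0.0, 0.0, 0.0, 0.0])   # hypothetical source 1
m2 = np.array([0.0, 0.0, 0.0, 0.5, 0.0, 0.0])    # hypothetical source 2

u1 = synthesize(elementary, m1)
u2 = synthesize(elementary, m2)
u12 = synthesize(elementary, m1 + m2)

# Linearity: the response to the summed source equals the summed responses.
print(np.allclose(u12, u1 + u2))
```

Once the elementary solutions are stored, a seismogram for a new source mechanism costs only a matrix-vector product, which is what makes database approaches so fast.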

What about synthetic seismograms for an entire sphere? While it may appear 
natural to stick to spherical coordinates, this is a no-go! The reason is, as indi- 
cated earlier, that the wave equation in spherical coordinates has singularities and 
regular discretization of spherical coordinates leads to decreasing grid spacing 
near the poles (certainly familiar from looking at a globe). From a stability point 
of view this is not acceptable. A solution to the problem came with the work 
of Ronchi et al. (1996) who introduced the cubed-sphere concept as already 
mentioned in Section 10.2. 

A spherical object can be discretized by deforming cubes such that they fill an 
entire sphere (see Fig. 10.8). This opens the path to applying numerical methods 
like spectral elements to the problem of global wave propagation. The flexibility of 
Galerkin-type methods perfectly matches the requirement to describe the wave- 
field on curved hexahedral elements that make up an entire Earth model. The 
original work in the dissertation of Chaljub (2000) and Chaljub et al. (2003) was
taken up and extended by Komatitsch and Tromp (2002a) and Komatitsch and
Tromp (2002b). A special formulation for the coupling of fluid and solid parts of
the Earth's interior was presented by Chaljub et al. (2007). The subsequently
developed community code (specfem3dglobe) is in my view one of the great success
stories of computational seismology.
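To get a feeling for the cubed-sphere idea, the surface mapping can be sketched in a few lines of Python. Each of the six faces of a cube is parametrized by two angles between $-\pi/4$ and $\pi/4$ and projected onto the unit sphere. The equiangular (gnomonic) variant used below is one common choice and is an assumption here; it is not necessarily the exact construction used in any particular code, and the function name `cubed_sphere_face` is hypothetical.

```python
import numpy as np


def cubed_sphere_face(face, n):
    """Points of one cubed-sphere face on the unit sphere.

    face : integer 0..5 (axis = face // 2, sign = +1 for even, -1 for odd)
    n    : number of grid points per direction
    Returns an array of shape (n, n, 3).
    """
    angles = np.linspace(-np.pi / 4, np.pi / 4, n)   # equiangular coordinates
    xi, eta = np.meshgrid(angles, angles, indexing="ij")
    axis = face // 2
    sign = 1.0 if face % 2 == 0 else -1.0
    points = np.empty((n, n, 3))
    points[..., axis] = sign                  # the cube face lies at +/-1
    points[..., (axis + 1) % 3] = np.tan(xi)
    points[..., (axis + 2) % 3] = np.tan(eta)
    # Radial projection of the cube face onto the unit sphere.
    return points / np.linalg.norm(points, axis=-1, keepdims=True)


n = 17
faces = [cubed_sphere_face(f, n) for f in range(6)]

# Ratio of largest to smallest distance between neighbouring grid points.
d_max, d_min = 0.0, np.inf
for pts in faces:
    for ax in (0, 1):
        d = np.linalg.norm(np.diff(pts, axis=ax), axis=-1)
        d_max, d_min = max(d_max, d.max()), min(d_min, d.min())
spacing_ratio = d_max / d_min

print(f"max/min grid spacing on the cubed sphere: {spacing_ratio:.2f}")
```

In contrast to a regular latitude-longitude grid, where this ratio grows without bound towards the poles, the spacing on the cubed sphere stays within a modest factor everywhere, so no part of the mesh dictates an unreasonably small time step.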


This solver allowed, for the first time, the calculation (exact to within
numerical accuracy) of waves through 3D global Earth models, including
ellipticity, anelasticity, surface topography, and anisotropy. The open-source
philosophy and the distribution via the NSF-funded CIG project⁶ allowed
extensions and improvements by the seismological community. Today this approach
is the method of choice for the simulation of global or planetary wave
propagation.

A further milestone was the publication of synthetic seismograms for the M7.9 
2002 Denali fault earthquake calculated on the Earth Simulator in Japan, a super- 
computer that was installed in response to the devastating M7.2 Kobe earthquake 
in Japan in 1995. The comparison between observation and theory presented 
by Tsuboi et al. (2003) (see Fig. 10.9) illustrated the state of knowledge of 
the 3D structure of the Earth’s interior (at the frequency range possible at the 
time). 

It was the starting point for full wavefield modelling of global wave prop- 
agation, with the ultimate goal of applying adjoint inversion techniques on a 
global scale. Tromp et al. (2010) started an initiative by which synthetic seismo- 
grams and wavefield animations are automatically calculated after each sizeable 
earthquake. Further applications include the study of seismic waves following a 
meteorite impact (Meschede et al., 2011). An excellent review of the spectral- 
element method for both forward and inverse functionalities is presented in Peter 
et al. (2011). 

There is a tremendous amount of research related to global (and planetary) 
wave propagation using the specfem3dglobe code and to list it all is impossible. 
Examples are 3D mantle structure forward modelling experiments (Schuberth 
et al., 2009; Schuberth et al., 2012; Schuberth et al., 2015), studies exploring the 
sensitivities to certain parameter classes (Sieminski et al., 2009; Sieminski et al., 
2009), and improving global earthquake parameters using 3D planetary-scale 
simulations (Lentas et al., 2013). 

An alternative approach was presented by Capdeville et al. (2003) who com- 
bined a spectral-element formulation in the Earth’s mantle with a normal-mode 
solution in the core. This approach is particularly useful for applications with 
focus on mantle waves. 

To my knowledge the only alternative to the spectral-element method for 
global wave propagation was presented by Wilcox et al. (2010), who introduced 
a nodal discontinuous Galerkin approach for a spherical mesh but only for sim- 
plified Earth models. An example of a tetrahedral mesh adapted to the velocity 
structure inside the Earth is shown in Fig. 10.10. While it is possible to use 
such meshes for global wave propagation in combination with the discontinuous 
Galerkin method, the computational effort far exceeds the one required when 
using spectral elements. 

Computational global wave propagation is dominated by the spectral-element 
method both in the 3D case (cubed sphere) and in the axisymmetric case (where 
finite-difference approximations still have some attraction due to their simple 
algorithms). 

Fig. 10.9 3D global simulation. Observed (black) and synthetic (red) transverse
component seismograms for the M7.9 2002 Denali fault earthquake calculated on
the Earth Simulator using the spectral-element method. The level of fit between
observations (black) and simulation (red) reflects the knowledge about global
3D structure. From Tsuboi et al. (2003). Reprinted with permission.


6 CIG stands for Computational Infrastructure for Geodynamics
(<http://www.geodynamics.org>) and provides access to a number of geophysical
simulation codes. See Appendix.

Fig. 10.10 Global unstructured grid. Tetrahedral grids offer flexible
adaptation to seismic velocity structures. However, computational requirements
exceed those of alternative schemes. Figure courtesy of S. Wenk.


10.4 Strong ground motion and dynamic 
rupture 


Simulation codes for strong ground motion and dynamic rupture simulation 
in 3D have similar requirements, which is the reason why they are presented 
here in one section. In fact, many algorithms were first developed for modelling 
wavefields due to kinematic (i.e. predefined) sources and later extended to allow 
spontaneous (dynamic) rupture at predefined fault surfaces. 

In the absence of any hope of predicting earthquakes in a deterministic way, 
the calculation of seismic wavefields for realistic earthquake scenarios has become 
one of the most important tasks in seismology and earthquake engineering. The 
term strong ground motion in this context takes the meaning that the earthquakes 
to be studied (and the associated ground motion) are sizeable (i.e. potentially lead 
to damage), implying usually that the region to be studied is not too far away from 
the source. Strong earthquake ground motions may well exceed 1 g (≈ 10 m/s²), 
which would saturate standard broadband velocity sensors. Therefore, strong 
motion networks are predominantly based on accelerometers. 

The scientific importance of earthquake simulations and their substantial so- 
cietal relevance led to early applications of numerical methods to this problem, 
a vast number of publications, and continuous developments that are still ongo- 
ing. Therefore, only a few aspects of this field can be covered. An example of 
a 3D earthquake simulation involving a sedimentary basin structure is shown in 
Fig. 10.11. 

Let us list the key features relevant for earthquake scenario simulations: 


• Realistic earthquake scenario simulations only make sense in 3D, as strong
  ground motion is strongly affected by 3D structure, rupture behaviour, and
  radiation patterns.

• Except for mega-earthquakes (M9), a Cartesian framework is usually
  sufficient (e.g. Los Angeles Basin, San Francisco Basin).

• Accurate earthquake scenario calculations that can be compared with
  observations necessitate good structural subsurface models.

• To make civil engineers happy (i.e. to simulate wavefields relevant for
  structural damage) you need to achieve frequencies well above 1 Hz.

• The necessity for high frequencies in turn implies good knowledge of the
  near-surface structure. (Difficult! One reason why stochastic earthquake
  simulations are so popular.)

• Large earthquakes happen on finite faults. These have to be initialized using
  the superposition principle.

• Earthquake faults may be of complex shape.

• Rupture behaviour can be predefined (kinematic) or the result of a process
  initiated on a predefined fault with unknown outcome (dynamic or
  spontaneous rupture).

• In many cases the accurate modelling of complicated 3D sedimentary (i.e.
  low-velocity) structures is important.

• Strong ground motion simulations may require the incorporation of nonlinear
  rheologies (e.g. plastic behaviour, damage rheology).

• Because of the high-frequency requirements, earthquake scenario calculations
  to date are computationally very demanding.

• To obtain a reasonable picture about shaking hazard in a seismically active
  region, a large number of 3D simulations might be necessary, accounting
  for uncertainties in 3D structure and earthquake characteristics.

• Recent extensions are the incorporation of structures (bridges, buildings)
  in the modelling (soil-structure interaction).
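
The link between the high-frequency requirement and the computational cost noted
in the list above can be made concrete with a back-of-the-envelope estimate. The
following sketch assumes a regular grid, a fixed number of grid points per
shortest wavelength, and a time step limited by a Courant-type stability
criterion. The function name simulation_cost and all default values (model
extent, velocities, 10 points per wavelength, Courant number 0.5) are
illustrative assumptions and are not taken from any specific study.

```python
import math


def simulation_cost(f_max, v_min=500.0, v_max=6000.0,
                    extent=(100e3, 100e3, 50e3), duration=60.0,
                    points_per_wavelength=10.0, courant=0.5):
    """Rough size of a regular-grid 3D simulation (all defaults are illustrative).

    f_max in Hz, velocities in m/s, extent in m, duration in s.
    Returns (number of grid points, number of time steps, their product).
    """
    # Grid spacing is dictated by the shortest wavelength in the model.
    dx = v_min / f_max / points_per_wavelength
    n_points = math.prod(length / dx for length in extent)
    # Time step is limited by the fastest velocity (Courant-type criterion).
    dt = courant * dx / v_max
    n_steps = duration / dt
    return n_points, n_steps, n_points * n_steps


for f_max in (1.0, 2.0, 4.0):
    n_points, n_steps, work = simulation_cost(f_max)
    print(f"{f_max:3.0f} Hz: {n_points:9.2e} points, {n_steps:9.2e} steps, "
          f"{work:9.2e} point updates")
```

Under these assumptions, doubling the maximum frequency multiplies the number of
grid points by 8 and the total number of point updates by 16, which is one way
to see why frequencies well above 1 Hz remain so demanding.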


Before listing some of the key studies it is important to distinguish kinematic 
and dynamic (rupture) simulations. When the space-time behaviour (also called 
slip history) on a fault is predefined, then one speaks of a kinematic rupture. A 
more physical approach is to define a fault plane, introducing a frictional bound- 
ary condition that determines what happens to a fault patch when it breaks, and 
then initialize an earthquake at the desired hypocentre. This is called dynamic or 
spontaneous rupture modelling. At first, dynamic rupture simulations were per- 
formed primarily to understand the rupture itself. Only recently have they been 
used in connection with strong ground motion simulations and to model real 
observations. 
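
To illustrate what a frictional boundary condition on a fault can look like, the
sketch below uses a linear slip-weakening law, in which the friction coefficient
drops from a static to a dynamic value over a critical slip distance. This
particular law is chosen here only as a common textbook example; it is an
assumption of this sketch and not necessarily the formulation used in the
studies cited in this section. The function names and the parameter values
(mu_s, mu_d, d_c) are hypothetical.

```python
def friction_coefficient(slip, mu_s=0.6, mu_d=0.5, d_c=0.4):
    """Linear slip-weakening: friction drops from mu_s to mu_d over slip d_c (m)."""
    if slip >= d_c:
        return mu_d
    return mu_s - (mu_s - mu_d) * slip / d_c


def fault_strength(slip, normal_stress, **friction):
    """Shear stress (Pa) a patch can sustain for a compressive normal stress (Pa)."""
    return friction_coefficient(slip, **friction) * normal_stress


def patch_breaks(shear_stress, slip, normal_stress):
    """A patch starts (or keeps) slipping once shear stress reaches its strength."""
    return shear_stress >= fault_strength(slip, normal_stress)


# A patch loaded to 70 MPa under 120 MPa normal stress holds while unbroken,
# but cannot hold once it has weakened (slip >= d_c).
print(patch_breaks(70e6, 0.0, 120e6), patch_breaks(70e6, 0.4, 120e6))
```

In a kinematic simulation the slip history would simply be prescribed; in a
dynamic simulation a rule of this kind is evaluated on every fault patch at
every time step, so that the rupture evolves spontaneously.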


Strong ground motion simulations 


Early kinematic 3D calculations using a staggered-grid finite-difference method 
were presented by Graves (1993) and Graves (1995). The group of Jacobo Bielak 
applied the classic finite-element method with implementation on parallel hard- 
ware (Li et al., 1994) and many applications to strong ground motion problems 
(e.g. Bielak and Xu, 1999; Bielak et al., 1998). Bielak et al. (2005) applied the 
octree approach allowing local mesh refinement to regions with low velocities. 
Ground motion calculations can be used to extract so-called shaking hazard
maps quantifying the maximum expected shaking at the Earth’s surface (an 
example is shown in Fig. 10.12). 
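
As a minimal sketch of how such a map can be extracted from simulation output,
the following code takes the simulated horizontal ground velocity at the surface
and keeps the peak value at each surface point, then combines several scenarios.
The array layout (time, y, x) and the function names are assumptions made for
this illustration. Converting peak motion to Mercalli intensity, as in
Fig. 10.12, requires an empirical relation that is not given here, so the sketch
stops at peak ground velocity.

```python
import numpy as np


def peak_ground_velocity(vx, vy):
    """Peak horizontal velocity at each surface point.

    vx, vy: arrays of shape (nt, ny, nx) holding the two horizontal
    velocity components at the surface for all time steps.
    """
    horizontal = np.sqrt(vx ** 2 + vy ** 2)
    return horizontal.max(axis=0)


def scenario_maximum(pgv_maps):
    """Largest peak value per surface point over a collection of scenarios."""
    return np.max(np.stack(pgv_maps), axis=0)


# Tiny synthetic example: 3 time steps on a 2 x 2 surface grid.
vx = np.zeros((3, 2, 2))
vy = np.zeros((3, 2, 2))
vx[1, 0, 0], vy[1, 0, 0] = 3.0, 4.0
pgv = peak_ground_velocity(vx, vy)
print(pgv)
```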

Wang et al. (2008) used the finite-difference method to calculate a database 
with subfault Green’s functions allowing subsequent synthesis of arbitrary slip 
histories and associated ground motion scenarios. Yin et al. (2011) simulated 
earthquakes with the finite-element method (GeoFEM) specifically developed for 
the Earth Simulator in Japan at the time. 
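
The idea of synthesizing ground motion from a database of subfault Green’s
functions rests on the superposition principle: the seismogram at a receiver is
the sum, over all subfaults, of each Green’s function convolved with the slip
rate on that subfault. The sketch below is a generic illustration of this
principle and not the implementation of Wang et al. (2008); the function name
synthesize and the array layout are assumptions.

```python
import numpy as np


def synthesize(greens, slip_rates, dt):
    """Superpose subfault contributions at one receiver.

    greens:     (n_subfaults, nt) response to unit impulsive slip on each subfault
    slip_rates: (n_subfaults, nt) slip-rate history of each subfault
                (rupture delay is contained in the time function)
    dt:         sampling interval in s
    """
    nt = greens.shape[1]
    seismogram = np.zeros(nt)
    for g, s in zip(greens, slip_rates):
        # Discrete approximation of the convolution integral.
        seismogram += np.convolve(g, s)[:nt] * dt
    return seismogram


# Two subfaults with impulsive Green's functions and impulsive slip rates.
dt = 0.1
greens = np.zeros((2, 10))
greens[0, 2] = 1.0 / dt
greens[1, 4] = 1.0 / dt
slip_rates = np.zeros((2, 10))
slip_rates[0, 1] = 1.0
slip_rates[1, 0] = 2.0
seis = synthesize(greens, slip_rates, dt)
```

Once the Green’s functions are stored, a new slip history only costs a set of
convolutions rather than a new 3D simulation.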

Recent applications of the spectral-element method to kinematic rupture
problems include simulations of the L’Aquila earthquake (Smerzini and Villani,
2012; Magnoni et al., 2013) and earthquakes in Taiwan (Lee et al., 2009)
incorporating and discussing effects of strong topography. Subduction
earthquakes of the Cascadia area were simulated by Delorey et al. (2014) using
a finite-difference approach. Kinematic rupture simulations were carried out
with the discontinuous Galerkin approach (Käser et al., 2007b; Käser and
Gallovič, 2008; Gallovič et al., 2010). An example is shown in Fig. 10.13.
Mazzieri et al. (2013) developed an open-source nodal discontinuous Galerkin
scheme (SPEED) for multi-scale problems that was subsequently applied to
earthquake simulation problems. Further milestones include the work of Moczo
et al. (2010b), Moczo et al. (2011), and Maufroy et al. (2015) comparing
various methods with respect to their capability of modelling high v_p/v_s
ratios. An extensive recent review of the state of the art of earthquake
simulation including soil-structure interaction is given by Paolucci et al.
(2014) and in the book by Moczo et al. (2014).

Fig. 10.11 Earthquake scenario calculations. Surface snapshots of the
horizontal component of ground velocity for an earthquake happening in the
Cologne basin, Germany. The map corresponds to a 120 km × 120 km region. The
epicentre is outside the sedimentary basin denoted by (depth) contour lines.
The wavefield is trapped inside the basin, leading to prolonged shaking
compared to bedrock sites. From Ewald et al. (2006).

Fig. 10.12 Shaking hazard maps. Mercalli intensity derived from simulations of
earthquakes in the region discussed in Fig. 10.11. The sedimentary basin
(contour lines) leads to increased shaking amplitudes and therefore higher
Mercalli intensity. From Ewald et al. (2006).

7 In my view the organizers of these projects are heroes. It is extremely hard
to motivate groups to participate as getting benchmarks right takes time. Also,
researchers involved do not necessarily get the credit they deserve. However,
these verification exercises are important steps towards credible and
reproducible science in our field.


Dynamic rupture simulation 


Following the pioneering work on 2D rupture problems using the finite-difference 
method by Andrews (1973), Andrews (1976a), Andrews (1976b), and Madariaga 
(1976), the applications were extended to 3D by Day and Boatwright (1982) and 
Day (1982). The potential of this approach to understand strong ground mo- 
tion observations and to quantify shaking hazard was recognized by Kim Olsen 
and co-workers (Olsen et al., 1995; Olsen and Archuleta, 1996; Olsen et al., 
1997; Madariaga et al., 1998), leading to many realistic large-scale parallel earth- 
quake scenario simulations (e.g. Olsen et al. (2008) within the framework of the 
TeraShake project). This line of research continues with recent efforts to match 
simulations with geological observations (precariously balanced rocks, Lozos 
et al., 2015). The impact of various dynamic rupture scenarios for a bi-material 
interface was investigated by Brietzke et al. (2009) using a 3D finite-difference 
method. A recent hybrid approach combining boundary-element methods (rup- 
ture) with a 3D finite-difference method (wave propagation) was presented in 
Aochi and Ulrich (2015). 

The spectral-element method is also widely used for dynamic rupture simula- 
tions (Festa and Vilotte, 2006; Kaneko et al., 2008; Kaneko et al., 2011; Galvez 
et al., 2014). Due to the specific description of wavefields in the discontinuous 
Galerkin method the application to rupture problems (with their natural discon- 
tinuous behaviour across the fault plane) seemed obvious. The 2D version (de la 
Puente et al., 2009a) was soon after extended to 3D (Pelties et al., 2014; Pelties 
et al., 2015). An attractive feature of this approach is the possibility of complex 
fault shapes (see Fig. 10.14). At the present time it appears that the discontinuous 
Galerkin method has considerable advantages over other numerical approaches in 
dealing with the nonlinear frictional boundary conditions. 

Community projects have been set up that aim at comparing various numerical 
techniques for given 3D structures, as well as kinematic and/or dynamic rupture 
specifications. These projects are extremely important and should be supported.⁷ 

Recent examples are the finite-source scenarios for complex basins (Chaljub
et al., 2010b; Chaljub et al., 2015; Maufroy et al., 2015), and the Euroseistest
project (Chaljub et al., 2010a). Other initiatives are the Shakeout project
(Bielak et al., 2010) and the SCEC-USGS rupture dynamic code comparison
exercise (Harris et al., 2010). Let me finish by quoting the conclusions of one
of these validation exercises (Chaljub et al., 2010b):


The main recommendation to obtain reliable numerical predictions of 
earthquake ground motion is to use at least two different but comparably 
accurate methods, for instance the present formulations and implementations 
of the finite-difference method, the spectral-element method, and the arbitrary 
high-order (ADER) discontinuous Galerkin method. 


That says it all. In any case, I highly recommend careful study of the publica- 
tions of these validation exercises before choosing a solution strategy for a specific 
problem. 


10.5 Seismic tomography—waveform 
inversion 


The imaging of Earth’s interior structure using seismic data is so fundamental 
that it deserves some special attention. It is tightly linked to the history of com- 
putational seismology and today one of the most active and expanding fields in 
seismology. For decades, the Earth’s interior structure was mapped using (more 
or less only) travel time information and ray-theoretical concepts for the forward 
problem. In the past few decades this has dramatically changed. With increas- 
ing computational power the goal is now to calculate entire waveforms for 3D 
models (e.g. using the methods described in this volume) and match them with 
observations as well as possible. What do we really mean by waveform inversion? 

Let us start with an example. Fig. 10.15 illustrates seismic observations at 
station OUZ in New Zealand following an earthquake in Indonesia (solid black 
line). Synthetic seismograms for some initial 3D Earth structure (dashed line) do 
not match the observations well. Following an iterative scheme that progressively 
alters the Earth model thereby minimizing the misfit between theory and obser- 
vations leads to the red line. The final fit is much better. This procedure is called 
waveform inversion or waveform fitting. In simple terms, the goal is to minimize a 
discrete functional like: 


$$\text{misfit} = \sum_{\text{sources}} \; \sum_{\text{receivers}} \; \sum_{\text{components}} \| d - g(m) \| . \qquad (10.1)$$


Here d is a vector containing seismogram samples of one motion component at a 
receiver location. g(m) denotes the forward problem, that is, the solution of the 
wave equation for Earth model m calculated with the same sampling interval. The 
double bars denote the mathematical norm (e.g. the least-squares norm L2)
defining distances in model spaces. Other comparative measures like cross-correlations 
(time shifts, coefficients) are also possible. 
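
A minimal sketch of how a functional of the form (10.1) can be evaluated is
given below, assuming the L2 norm and assuming that observed and synthetic
seismograms are stored in arrays with one axis each for sources, receivers,
components, and time samples. The array layout and the function name l2_misfit
are choices made for this illustration only.

```python
import numpy as np


def l2_misfit(observed, synthetic):
    """Sum of L2 norms ||d - g(m)|| over sources, receivers, and components.

    observed, synthetic: arrays of shape
    (n_sources, n_receivers, n_components, n_samples), equally sampled.
    """
    residual = observed - synthetic
    # One norm per trace (last axis holds the time samples) ...
    per_trace = np.sqrt(np.sum(residual ** 2, axis=-1))
    # ... then the triple sum of Eq. 10.1.
    return per_trace.sum()


# 2 sources, 3 receivers, 3 components, 4 samples per trace.
d = np.zeros((2, 3, 3, 4))
g_m = np.ones((2, 3, 3, 4))
print(l2_misfit(d, g_m))
```

In a real inversion, synthetic would be the output of a 3D wave simulation for
the current Earth model m, which is what makes every evaluation of the misfit
so expensive.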

Fig. 10.13 Grenoble simulation using the discontinuous Galerkin method. Top:
Snapshot (time: 13.6 s) of a finite-source earthquake originating on a fault
(red line) outside a sedimentary basin. The wavefield continues to vibrate once
the body waves have passed. Bottom: Tetrahedral meshes that make up the 3D
model. Figures courtesy of M. Kaiser. Reprinted with permission.

Fig. 10.14 Fault system mesh. Triangular mesh of the Landers fault system. The
mesh is severely densified towards the fault to allow accurate calculations of
the rupture process. From de la Puente et al. (2009a).


Fig. 10.15 Full waveform inversion. 
The improvement of iterative adjoint 
waveform misfit. The graph shows ob- 
served data (solid black), synthetic seis- 
mograms for initial 3D model (dashed 
black), and final 3D model (red line). 
Figure courtesy of A. Fichtner. 


There are many inverse approaches that work with the waveform-fitting con- 
cept; from the pioneering work on surface wave inversion (Woodhouse and 
Dziewonski, 1984) using a ray-based approach to the recent finite-frequency 
waveform-matching approach using cross-correlation measurements of phase 
misfit, for example Sigloch et al. (2008), Tian et al. (2011). Perhaps the most 
holistic and most computationally expensive approach is the use of full 3D mod- 
elling schemes for forward (and usually also for inverse) calculations. In the 
following we will focus on these types of applications using numerical schemes 
as discussed in this volume. As before let us summarize some aspects of wave 
simulations in the context of waveform inversion: 


• Formal waveform-fitting algorithms require many forward problems to be
  solved. 3D simulations are expensive anyway so this means potentially
  severe restrictions concerning the frequency range.

• The target is the modelling of real observations. Therefore, viscoelasticity
  and in many cases anisotropy have to be taken into account.

• Waveform inversion can be done at local (Cartesian), regional, or global
  scale (spherical geometry).

• Waveform-fitting solutions (nonlinear or linearized) can be built around
  standard forward simulation algorithms.

• Structural inversion requires good knowledge of source properties (e.g.
  source time function in exploration-type problems and source depth,
  moment tensors for earthquake data).

• Because of the large computational requirements, numerical solutions are
  preferred that do not require re-meshing with Earth model updates.


Ideally, we would like to explore all (or many) possible Earth models, compare 
synthetic seismograms with observations, and find the best one, or, even better, a 
collection of models that fit the data well. 

Unfortunately, it is not that easy. Because the forward problem is so expensive, 
the way forward is to start with a good guess (initial model) that hopefully is close 
enough to the final solution such that the solution can be found by gradient search 
(akin to the Newton algorithm). 
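
The role of the initial model can be illustrated with a deliberately simple toy
problem: a single unknown velocity, a forward problem that places a Gaussian
pulse at the corresponding travel time, and plain steepest descent on a
least-squares waveform misfit. Everything in this sketch (the one-parameter
model, the pulse shape, the finite-difference gradient, the fixed step length)
is an assumption made for illustration; real waveform inversion uses adjoint
techniques to obtain gradients for millions of parameters.

```python
import numpy as np

t = np.linspace(0.0, 10.0, 1001)   # time axis (s)
dt = t[1] - t[0]
DISTANCE = 10.0                    # source-receiver distance (km)
SIGMA = 0.5                        # pulse width (s)


def forward(v):
    """Toy forward problem g(m): Gaussian pulse arriving at t = DISTANCE / v."""
    return np.exp(-((t - DISTANCE / v) / SIGMA) ** 2)


def misfit(v, d):
    """Least-squares waveform misfit between synthetics and data d."""
    return 0.5 * np.sum((forward(v) - d) ** 2) * dt


def gradient(v, d, h=1e-4):
    """Central finite-difference approximation of d(misfit)/dv."""
    return (misfit(v + h, d) - misfit(v - h, d)) / (2.0 * h)


def gradient_search(v0, d, step=0.1, n_iter=50):
    """Steepest descent with a fixed step length, starting from v0."""
    v = v0
    for _ in range(n_iter):
        v -= step * gradient(v, d)
    return v


d_obs = forward(2.5)                   # "observations" for a true velocity of 2.5 km/s
v_good = gradient_search(2.3, d_obs)   # initial model close to the truth
v_poor = gradient_search(1.5, d_obs)   # initial model far from the truth
print(v_good, v_poor)
```

Starting from 2.3 km/s, the synthetic pulse overlaps the observed one and the
search converges towards the true value. Starting from 1.5 km/s, the pulses do
not overlap, the gradient is practically zero, and the search stalls near the
starting model.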


This approach was pioneered for the acoustic and elastic wave equations by 
Bamberger et al. (1982), Tarantola (1986), and Tarantola (1988). Tarantola and 
Valette (1982) and Tarantola (2005) embedded these concepts in a probabilistic 
inversion framework (Fig. 10.16). 

The theoretical concepts for waveform inversion led to a hype in exploration 
geophysics with industrial consortia (e.g. the GTG Group in Paris, see Fig. 10.16, 
or the Stanford Exploration Project SEP) heavily promoting research and the 
development of (parallel) inversion software (e.g. Crase et al., 1990; Igel et al., 
1996). But these applications, due to computational restrictions, were initially at 
most in 2D with limited applicability to the emerging 3D acquisition geometries. 

Therefore, the hype came to a halt in the nineties, when 3D solutions were 
impossible. While preparing this volume, I wondered whether this intuitive notion 
can be supported by data, and I think it can (see Fig. 10.17). The number of 
publications on full waveform inversion increased quite slowly in the nineties until 
around 2005, when computational power had evolved such that 3D inversion 
seemed possible. Activity in this field then exploded, and the rate since then has 
not relented, as can be seen in the figure (the data show exploration problems 
only; the same holds for earthquake seismology). 

In seismic exploration, alternative strategies based on numerical solutions in 
the frequency domain (Pratt et al., 1998) led to a number of applications in 2D 
(e.g. Bleibinhaus et al., 2007; Bleibinhaus et al., 2007) and in 3D (Ben Hadj Ali 
et al., 2009a; Ben Hadj Ali et al., 2009b; Krebs et al., 2009). The latter studies 
were based on the source encoding concept exploiting the superposition principle. 
Excellent reviews on forward and inverse modelling methodologies can be found 
in Virieux and Operto (2009) and Virieux et al. (2009). 

Fig. 10.16 Albert Tarantola (1949-2009). With his work on seismic waveform
inversion, the probabilistic formulation of inverse problems, his vision of the
role of computations, the leadership of the Geophysical Tomography Group in
Paris, and his vibrant personality, he had a strong impact on computational
seismology. Picture shows A.T. as referee during the soccer match at the 2006
SPICE workshop in Kinsale, Ireland.


Fig. 10.17 Publication history of wave- 
form inversion. The number of publi- 
cations in the field of seismic waveform 
inversion published by the Society of Ex- 
ploration Geophysicists (SEG) including 
extended abstracts, special volumes, jour- 
nals, and books. 

Fig. 10.18 Full waveform inversion. 
Slice at depth 234 m through the model 
update of an acoustic full waveform in- 
version. The acoustic Valhall ocean cable 
data set has 2,300 receivers and 50,000 
shots were fired. The image shows 
acoustic velocities (dark—low, bright— 
high). The white bands are paleo-rivers 
now embedded in the sediments. From 
Schiemenz and Igel (2013). 


8 Let us be honest: we have always 
envied the medical tomographers for 
their controlled source-receiver geome- 
tries, their straight ray paths, and the fasci- 
nating 3D images they obtain. But the gap 
is narrowing! 


A fundamental milestone in applied waveform inversion is the work of Sirgue 
et al. (2010) and Sirgue et al. (2012) showing inversion results for the Valhall 
data set (3D acquisition geometry with 2,300 sea-bottom receivers and 50,000 
sources) of a resolution that is reminiscent of medical tomography.⁸ Further
applications on this data set were carried out by Etienne et al. (2012) and Schiemenz 
and Igel (2013) using spectral elements (an example of a model gradient is shown 
in Fig. 10.18, indicating the high resolution potential). A strategy based on the 
finite-difference method was presented by Butzer et al. (2013) with an application 
to small-scale heterogeneities. 

What about earthquake seismology? For a long time this field was dominated by 
ray-based inversion methods. It is remarkable how much we have learned about 
the structure of Earth’s interior, enabling us to reduce a seismogram to a few 
bytes of information and throw away the rest. It is fair to say that a breakthrough 
for the application of waveform inversion tools to earthquake seismology came 
through the adoption of alternative misfit measures. Luo and Schuster (1991) 
suggested the use of cross-correlation functions as a misfit measure, allowing 
phase information to be extracted from waveform data. 
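
A minimal sketch of a cross-correlation measurement of this kind is shown
below: the time shift between an observed and a synthetic trace is taken as the
lag at which their cross-correlation is largest. This is a generic illustration
rather than the specific formulation of Luo and Schuster (1991); the function
name cc_time_shift and the sign convention (positive when the observed signal
arrives later than the synthetic one) are choices made here.

```python
import numpy as np


def cc_time_shift(observed, synthetic, dt):
    """Time shift (s) that maximizes the cross-correlation of two traces.

    Positive values mean that the observed signal is delayed with respect
    to the synthetic one. Both traces must share the sampling interval dt.
    """
    cc = np.correlate(observed, synthetic, mode="full")
    # In "full" mode, index len(synthetic) - 1 corresponds to zero lag.
    lag = np.argmax(cc) - (len(synthetic) - 1)
    return lag * dt


# Synthetic pulse at 4.00 s, "observed" pulse at 4.25 s.
dt = 0.01
t = np.arange(1000) * dt
syn = np.exp(-((t - 4.00) / 0.3) ** 2)
obs = np.exp(-((t - 4.25) / 0.3) ** 2)
shift = cc_time_shift(obs, syn, dt)
```

Unlike a sample-by-sample waveform difference, such a measurement responds
mainly to the phase (arrival time) of a signal and is largely insensitive to
its amplitude.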

In the seminal paper by Tromp et al. (2005) the concepts of adjoints, 
time reversal, and banana-doughnut kernels are merged, providing a frame- 
work for waveform inversion of earthquake data (for source and structure). 
Fichtner et al. (2008) introduced an alternative time-frequency domain mis- 
fit criterion based on the work of Kristekova et al. (2006). An inversion 
scheme using the finite-element method was presented by Askan and Bielak 
(2008). 

These theoretical developments laid the foundations for some of the first appli- 
cations to earthquake data. Chen et al. (2007) used the finite-difference method to 
invert for the velocity structure of the Los Angeles Basin. Tape et al. (2009) and 
Tape et al. (2010) applied the spectral-element method and an automated way 
of finding appropriate time-windows for matching synthetics with data (Maggi 
et al., 2009). 

The first application on a regional (continental) scale was presented by 
Fichtner et al. (2009a) and Fichtner et al. (2010). They used a regular-grid 
spectral-element method in spherical coordinates to image the structure beneath 
the continent of Australia, providing proof that, despite the potential uncertain- 
ties in source parameters, waveform inversion on this scale is possible. By now, 
waveform inversion approaches are applied to all regions where station density 
and data quality are sufficient. Examples are the European continent (Zhu et al., 
2012), the South Atlantic (Colli et al., 2013), North Anatolia (Fichtner et al., 
2013a), the western Mediterranean region (Fichtner and Villaseñor, 2015), and 
several regions in Asia (Chen et al., 2015a; Chen et al., 2015b). An illustration 
of continental-scale imaging is shown in Fig. 10.19. The emergence of waveform 
inversion-based images at various scales all over the place raises the question of 
how these models can be merged, combined, and reused. This is the topic of 
recent research (Fichtner et al., 2013b). 


An exciting current research front is the application of full waveform inversion 
for the whole planet using 3D simulations. As computational resources increase 
and our knowledge of Earth’s interior structure improves, these techniques are 
paving the way to explaining more and more energy in the observed global seis- 
mic wavefield. Recent results based on spectral-element methods were presented 
in French and Romanowicz (2015). Work in progress is shown in Fig. 10.20 us- 
ing the global version of specfem for the wave simulations. Capdeville et al. (2005) 
investigated theoretically the potential of source stacking for global waveform 
inversion, but with limited success. 

While these studies indicate a major advance in matching theory with ob- 
servations, we are still not doing very well at quantifying uncertainties in the 
resulting tomographic models. Despite recent developments (Fichtner and
Trampert, 2011b) it appears that only a substantial level of random search allows
proper quantification of model uncertainties. An attempt was made by Käufl et al. (2013) 
to try to test structural hypotheses with reduced dimensionality using Monte 
Carlo techniques (Sambridge, 1999a; Sambridge, 1999b). 

In summary, full waveform inversion is an extremely active field. It is likely 
that in a few years it will become the standard technology to invert for struc- 
ture on all scales. From experience, inverting real data is always hard. In terms 
of numerical techniques, finite-difference and spectral-element methods with 
regular or general hexahedral meshes have been used extensively. The compu- 
tationally more expensive methods based on tetrahedral meshes provide little 
advantage. 

Excellent presentations on the theory behind full waveform inversion can be 
found in Tromp et al. (2005), Peter et al. (2011), and Fichtner (2010), and
references therein. A recent open-source Python-based framework (LASIF) for the 
entire inversion workflow from data recovery to final model was presented by 
Krischer et al. (2015a). A full waveform inversion implementation using spectral 
elements on GPUs can be found in Gokhberg and Fichtner (2015). 


Fig. 10.19 Regional waveform inversion under Europe. Seismic velocity model at
100 km depth (red: low velocities, blue: high velocities) obtained by full
waveform inversion. Figure courtesy of A. Fichtner.

Fig. 10.20 Global waveform inversion. Velocity perturbations in Pacific region
after 15 iterations (red: low velocities, blue: high velocities), obtained by
full waveform inversion using the spectral-element method. Figure courtesy of
E. Bozdag.


10.6 Volcanology


The seismic monitoring of an active volcano is key to understanding the state
of its eruptive system. By their nature, volcanoes are usually difficult to
access (see Fig. 10.21), hard to instrument, and observations of ground motions
are extremely complex.⁹ Let’s summarize some requirements to model wave
propagation in a volcano:

• Rough topography will influence the wavefield and has to be taken into
  account.

• Mesh generation based on digital elevation models (DEMs) may be
  necessary.

• The internal structures are often highly uncertain. Observations suggest
  strongly scattering material that might have to be modelled with a random
  media approach.

• There are many sources of seismic energy (volcano-tectonic events,
  tremor-like signals, rock falls, dome collapses, bubble explosions, etc.),
  some of which might be difficult to describe with standard moment tensors.

• Some seismic signals (e.g. bubble explosions) may require the modelling of
  interaction between atmosphere and solid volcano edifice.

9 Some say a volcano is a seismologist’s nightmare: unknown internal structure,
strongly scattering, rough topography, etc. But of course it is fun to face
these multiple challenges.

Fig. 10.21 Mount Fuji, Japan. Most (active) volcanoes are characterized by
strong topography and severely scattering internal structures. Photo courtesy
of M. Goll.

Fig. 10.22 Volcano Merapi, Indonesia. Snapshot of waves generated by a source
near the summit of the volcano. Wavefield was simulated with a discontinuous
Galerkin method on a tetrahedral grid. Figure courtesy of A. Breuer.


Because the problem of seismic wave propagation in volcanoes is a truly 3D prob- 
lem, the application of numerical methods to this problem started only when 
resources were sufficient to allow 3D calculations. At an early stage, methods 
allowing flexible mesh geometry at the surface were not available. Therefore, 
attempts were made to adapt regular-grid finite-difference methods to allow 
complicated surfaces. Ohminato and Chouet (1997) introduced a method with 
which regular finite-difference blocks with appropriate boundary conditions were 
adapted to real volcano topographies. 

While it was in principle possible to use the approach to understand the topo- 
graphic effects of volcanoes (e.g. Ripperger et al., 2003), it required an extremely 
large number of grid points per wavelength to get the waveforms right. An ele- 
gant alternative with an effort similar to finite-difference methods was presented 
by O’Brien and Bean (2004) and O’Brien and Bean (2009), who extended earlier 
work (Toomey and Bean, 2000) in which a method based on a discrete par- 
ticle scheme for seismic wave propagation was introduced. This particle-based 
methodology was extensively used to study wave propagation and inverse prob- 
lems on volcanoes (Lokmer et al., 2007; Davi et al., 2010; Meétaxian et al., 
2009). 

The spectral-element method does allow sufficient flexibility with hexahedral 
meshes to model complex topography. This approach was adopted by van Driel 
et al. (2012), Kremers et al. (2013), and van Driel et al. (2015c) who investigated 
strain-rotation coupling, moment tensor inversion, and tilt effects on moment 
tensor inversion in models including realistic volcano topography. Further recent 
examples can be found in Kim et al. (2014) who studied infrasound signals at 
volcanoes using the finite-difference method. 

Last, but not least, the discontinuous Galerkin method on unstructured tetra- 
hedral meshes lends itself to problems like wave propagation inside volcanoes. 
The meshing based on arbitrary topography models is straightforward. The
modelling of wave propagation inside the Indonesian Merapi volcano was the chosen 
test case when the SeisSol code exceeded 1 PFlop performance (Breuer et al., 
2014), see Fig. 10.22. 

I am convinced that the simulation of seismic wave propagation inside
volcanoes will continue to develop, and become a standard procedure for volcano
monitoring. Methods allowing complex geometries (e.g. Galerkin-type methods,
finite-volume methods) will play a more important role than finite-difference
methods. However, because of the large computational requirements and the
necessity to develop high-quality meshes, it will take time to make these tools
available to scientists working in volcano observatories.¹⁰


10.7 Simulation of ambient noise 


The study of permanently recorded ambient seismic noise (ocean/atmosphere- 
generated or anthropogenic) is one of the most vibrant fields in seismology today. 
In particular the study of ocean-generated noise, with the option of perform- 
ing tomography in the absence of earthquake sources (Shapiro et al., 2005) and 
the possibility of studying the structure of Earth’s interior as a function of time 
(Brenguier et al., 2008) is revolutionizing our field. Ambient noise studies are 
predominantly data-processing tasks. Simulating the required long time series
numerically is a challenging problem. Nevertheless, in the quest to fully
understand the source mechanisms of the noise fields as well as their
interaction with 3D structure, simulations are likely to play an important role. 
The requirements are: 


• Ambient noise is dominated by surface waves because sources usually
  act at the surface. Therefore, accurate implementation of the free-surface
  boundary condition is crucial.

• For the noise field to develop, many wavelengths need to be propagated.
  This is challenging in 3D, requiring large models and highly accurate
  time-extrapolation schemes.

• Depending on the desired Earth model complexity, surface topography (or
  bathymetry) needs to be taken into account (e.g. to explain Love waves in
  the ocean-generated noise).

• Ocean-generated noise is characterized by continental or global-scale
  propagation distances and thus requires spherical geometry.

• Cultural or anthropogenic noise can be simulated using a Cartesian
  framework.

• Because of the required long time series, efficient absorbing boundaries are
  important when performing limited-area calculations.


A seminal study using simulations in a spherically symmetric Earth model with
normal modes (not discussed in this volume) was presented by Gualtieri et al.
(2013). The noise sources are modelled using ocean wave information, along with
bathymetry. The vertical component seismic noise spectra fit observed spectra.
However, a discrepancy is found between the modelled and observed horizontal
component spectra, which has been attributed to the existence of Love waves in
the observed noise. At the time of writing, the origin of the Love waves in the
ocean-generated noise is still not well understood. Answers are expected from 3D
simulations including bathymetry and internal 3D structures. In another synthetic
normal-mode study Cupillard and Capdeville (2010) investigated the amplitude
of surface waves reconstructed by noise correlation in the context of noise sources
located on the surface of the Earth. An example of Green’s functions emerging
from synthetic noise calculations is shown in Fig. 10.23.

10 While this can be a dream job, often the observatories are understaffed and
the maintenance of observational infrastructure takes up all the time.

Fig. 10.23 Green’s function from noise simulations. One-bit normalized
cross-correlations between the vertical displacements in the case of a uniform
distribution of noise sources. Due to the uniform distribution of noise, symmetric
Green’s functions emerge. From Cupillard and Capdeville (2010).
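The emergence of symmetric arrivals from noise correlations can be illustrated with a toy experiment. The following Python sketch is not taken from any of the studies cited above. It assumes a 1D, non-dispersive medium with constant velocity, two independent random wavefields travelling in opposite directions past two receivers, and one-bit normalization implemented as the sign of each sample. The function name and all parameter values are hypothetical.

```python
import numpy as np

def onebit_noise_correlation(dist=3000.0, c=3000.0, dt=0.01, nt=20000, seed=0):
    """Cross-correlate one-bit normalized noise at two receivers (1D toy model)."""
    rng = np.random.default_rng(seed)
    shift = int(round(dist / (c * dt)))   # inter-receiver travel time in samples
    # Two independent random wavefields, travelling in +x and -x direction
    right = rng.standard_normal(nt + shift)
    left = rng.standard_normal(nt + shift)
    # Receiver A at x = 0, receiver B at x = dist:
    # the right-going field reaches A first, the left-going field reaches B first
    rec_a = right[shift:] + left[:nt]
    rec_b = right[:nt] + left[shift:]
    # One-bit normalization: keep only the sign of each sample
    a, b = np.sign(rec_a), np.sign(rec_b)
    # Cross-correlation via FFT (zero-padded to avoid wrap-around)
    n = 2 * nt
    cc = np.fft.irfft(np.fft.rfft(b, n) * np.conj(np.fft.rfft(a, n)), n)
    # Keep lags between -2 and +2 travel times
    maxlag = 2 * shift
    lags = np.arange(-maxlag, maxlag + 1) * dt
    cc = np.concatenate((cc[-maxlag:], cc[:maxlag + 1]))
    return lags, cc

lags, cc = onebit_noise_correlation()
t_causal = lags[lags > 0][np.argmax(cc[lags > 0])]
t_acausal = lags[lags < 0][np.argmax(cc[lags < 0])]
print(f"correlation peaks at {t_acausal:.2f} s and {t_causal:.2f} s")
```

With sources on both sides, the correlation shows one peak at the positive and one at the negative inter-receiver travel time. Removing one of the two wavefields leaves only one peak, which is the 1D analogue of an uneven noise source distribution.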

Noise simulations using a regional-scale spectral-element method can be 
found in Stehly et al. (2011). They study the effects of homogeneous distribu- 
tions of random noise sources located at the free surface of an attenuating and 
spherically symmetric Earth model. 

Noise studies at higher frequencies have been undertaken for quite some time 
to estimate local site effects studying the spectral ratio of horizontal components 
and vertical components of ground motion (H/V). A simulation example based 
on the finite-difference method (Moczo et al., 2002) can be found in Guillier et al. 
(2006). Several simple 2D and 3D structures are considered, and the locations of 
the H/V peaks in the spectra are investigated and discussed. 

It is fair to say that both spectral-element and finite-difference methods (flat 
surfaces) seem to be well suited for the tasks of ambient noise modelling. With 
increasing frequencies and more and more domains of application, simulations 
will play an important role in supporting the results obtained from real data 
processing. 


10.8 Elastic waves in random media 


Everyone who has looked at a geological outcrop in detail accepts that the Earth’s 
crust (and most likely the deep interior) is far from homogeneous, or even locally 
homogeneous. As discussed in the introductory chapter, whether the spatial scales 
of heterogeneities have to be taken into account for seismic wave propagation 
strongly depends on the (dominant) wavelengths of seismic waves. In the field 
of seismic scattering the effects of (random) spatial heterogeneities on wavefields 
are investigated. Strong effects are expected when the size of the scatterers
is similar to the wavelength of the elastic waves.
Some requirements for random media calculations are: 


• The scale of the scattering medium has to be properly discretized by the
  computational mesh.

• For strong scattering media and coda studies the implementation of
  absorbing boundaries is crucial for limited-area calculations.

• In many cases the scattering of body waves is the target, in which case
  free-surface boundary conditions are not relevant.

• The sampling requirements of heterogeneities favour low-order methods
  (e.g. finite-difference methods).

• Scattering problems require careful initialization of the elastic models with
  specified statistical properties.


Understanding waves in random media was one of the first domains of applica- 
tion for computational seismology. Frankel (1989) reviews the early applications 
of numerical methods to the problem of waves in random media, discussing finite- 
element and finite-difference methods. A problem that is still under discussion 
today is the partitioning of energy into P- and S-waves, already investigated by 
Dougherty and Stephen (1988) for the oceanic crust using 2D finite-difference 
simulations. A comprehensive way to generate random media in 2D using
ellipsoidal autocorrelation functions, together with the resulting wave effects
simulated with the finite-difference method, was presented in Ikelle et al. (1993).

Early this century, the extension to 3D media as well as more complex rhe- 
ologies became possible (Frenje and Juhlin, 2000; Martini et al., 2001; Bohlen, 
2002). All these studies were carried out with finite-difference methods. The ef- 
fect of random media on the analysis of travel times was investigated by Baig et al. 
(2003) and Baig and Dahlen (2004). Pham et al. (2009) used the 3D discon- 
tinuous Galerkin method to study crustal P-SH scattering, modelling rotational 
ground motions observed with a ring laser. 

Further recent finite-difference applications include the analysis of random 
media characterized by von Kármán correlation functions (Imperatori and Mai,
2013) and the modelling of observed high-frequency P-wave fields in Japan 
(Takemura and Furumura, 2013) in the presence of irregular surface topography. 
A recent application of the spectral-element method to random media calculations 
can be found in Obermann et al. (2016) who investigate the depth sensitivity of 
time-dependent velocity changes observed using ambient seismic noise. 

Finally, the question of the short-scale structure of the Earth’s mantle is still 
under debate. Finite-difference simulations of random mantle structures using 
the axisymmetric approximation can be found in Igel and Gudmundsson (1997) 
for the upper mantle and in Jahnke et al. (2008) for the entire mantle. Large-scale 
global wavefield simulations in 3D random mantle models were performed by 
Meschede and Romanowicz (2015) (see Fig. 10.24). 
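A random model with specified statistical properties can be initialized by spectral filtering of white noise, as described in the caption of Fig. 10.24. The following Python sketch shows the idea in 2D for an exponential covariance function. It is a minimal illustration under stated assumptions (periodic grid, isotropic correlation length) and not the procedure used in any of the cited studies. The function name, grid size, correlation length, and perturbation strength are hypothetical.

```python
import numpy as np

def random_medium(nx=256, nz=256, dx=100.0, corr_len=1000.0, sigma=0.05, seed=1):
    """2D random field with exponential covariance sigma^2 * exp(-r / corr_len)."""
    rng = np.random.default_rng(seed)
    # Distances from the origin on a periodic grid (wrapped at the edges)
    x = np.minimum(np.arange(nx), nx - np.arange(nx)) * dx
    z = np.minimum(np.arange(nz), nz - np.arange(nz)) * dx
    r = np.sqrt(x[:, None] ** 2 + z[None, :] ** 2)
    cov = sigma ** 2 * np.exp(-r / corr_len)
    # Power spectrum = Fourier transform of the covariance function;
    # tiny negative values caused by the wrapping are clipped to zero
    power = np.maximum(np.fft.fft2(cov).real, 0.0)
    # Filter white noise with the square root of the power spectrum
    noise = rng.standard_normal((nx, nz))
    return np.fft.ifft2(np.fft.fft2(noise) * np.sqrt(power)).real

field = random_medium()
vs = 3000.0 * (1.0 + field)   # S velocity with random perturbations (assumed background)
print(f"rms perturbation: {100 * field.std():.1f} %")
```

Because the filter is applied in the wavenumber domain, the resulting field is periodic. For wave simulations one would typically generate a larger field and cut out the region of interest.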


Chapter summary 


• Today, seismic wave simulation in 3D media is essential for the solution of
  many solid Earth science problems.

• There is no one-method-fits-all situation. The suitable simulation
  technology depends on the specific problem.

• While mature simulation codes for 3D wave propagation have existed for
  some time, their extensive use to model observations has only recently
  started.

• In the near future, computational resources should allow us to perform
  massive 3D calculations for parameter space studies, improved 3D imaging,
  the assessment of uncertainties, and our understanding of earthquakes
  and the associated strong ground motion.

Fig. 10.24 Random model in Cartesian coordinates with a stationary exponential
covariance function. The model was computed by filtering white noise to the
power spectrum that is given by the Fourier transform of the covariance
function descriptive of the random field. Figure from Meschede and Romanowicz
(2015).

EXERCISES

(10.1) Create a matrix of application domains and numerical methods discussed
in this volume. Discuss pros and cons of the methods for the various
applications.

(10.2) Search the literature for current applications of 3D seismic wave
simulations in your domain of interest. Discuss the computational set-up (mesh
type, mesh size, regular vs. irregular) in connection with the specific
numerical method employed. Is this the optimal method for the problem?
How was the accuracy of the method verified? Would the results be fully
reproducible?


Current Challenges in Computational Seismology


At least to some extent the motivation for writing this volume was the fact that now 
a large number of 3D simulation codes are in place. Most conceivable numerical 
methods have been applied to the wave-propagation problem. Some say the for- 
ward problem for seismic wave propagation is solved. This final chapter aims at briefly 
highlighting a few issues, showing that surprises might still be around the corner 
and there are many exciting challenges that will keep us busy for some time. 


11.1 Community solutions 


As indicated in the introduction, there is no way out of having to rely more and 
more on professionally engineered software solutions. This applies also to com- 
putational wave propagation, in particular when running on increasingly parallel 
supercomputer infrastructure. Projects like CIG (Computational Infrastructure 
in Geodynamics, <http://www.geodynamics.org>) provide parallelized software 
for a variety of problems in geophysics, ranging from mantle convection to crustal 
deformation to seismology on all scales. The software can be downloaded and 
must be installed by the researchers themselves. In many cases this works, but 
with increasing complexity of hardware, even this approach becomes more and 
more difficult. In addition, for the developing groups, raising funds to maintain 
the software is hard or impossible. 

To decrease the time needed for research by those Earth scientists requiring 3D
simulation technology we must go beyond this mode of operation. Ideally, a few 
well-developed codes that cover most of the standard workflow parts in com- 
putational seismology should be installed on the supercomputer infrastructure 
as modules, readily compiled, and permanently benchmarked with regression 
tests. An attempt in this direction was made with the EU-funded VERCE project 
(<http://www.verce.eu>, Atkinson et al., 2015). The goal was to develop a Web- 
based platform through which 3D simulation tasks can be initialized. In the 
course of this, community software was installed as pre-compiled modules on the 
European supercomputer infrastructure PRACE (<http://www.prace-ri.eu>). 

Does the community want that? What makes the acceptance of such models
difficult is the desire to make (slight or substantial) modifications to existing
software solutions. Another issue is that many codes require the provision of
computational meshes, and currently there are no real standards to do that (see
Section 11.3).

Fig. 11.1 Homogenization. An initial discontinuous Earth model (a) is meshed
(honouring discontinuities) (b) and the wavefield is simulated (e). A smooth
version of the model (c) is simulated with a regular grid method (d) leading to
the same wavefield (f) within some error bounds. From Capdeville et al. (2015).
Reprinted with permission.

In any case, funding bodies and the communities have to acknowledge that today
software is infrastructure, and it requires a substantial amount of funding for
maintenance. The benefit for research will be substantial. The hope is that initia- 
tives like EPOS (European Plate Observing System, <http://www.epos-eu.org>) 
and CIG or EarthCube (both in the USA) can play a leading role in this direction. 


11.2 Structured vs. unstructured: homogenization


With seismic observations we are looking at the Earth’s interior with a severely 
band-limited wavefield. How (on Earth) can we expect to recover a model that 
contains infinite spatial frequencies (e.g. layer boundaries)? Let me rephrase that: 
You want to simulate waves through a layered model (i.e. with sharp discontinu- 
ities). Is there a smooth model that leads to the same wavefield (within some 
bounds)? This is the question at the heart of homogenization, a method that 
has been developed by Yann Capdeville and co-workers in the past few years 
(Capdeville and Marigo, 2007; Capdeville and Marigo, 2008; Capdeville et al., 
2010a; Capdeville et al., 2010b; Capdeville et al., 2013; Capdeville et al., 2015),
extending the work by Backus (1962) and others. The concept is illustrated in 
Fig. 11.1 with the structurally complex 2D Marmousi model containing many 
velocity discontinuities. To obtain accurate seismic wavefields for such models the 
most promising approach is currently to honour the discontinuities and develop
a mesh that follows the layer boundaries. This (1) is extremely time-consuming,
and (2) requires a solver that can handle irregular meshes.

Homogenization allows the original discontinuous model to be replaced by a 
smooth model that does not contain layer discontinuities. Within some (tiny) error 
bounds the wavefields are the same (Fig. 11.1f). However, the latter approach 
has tremendous advantages: a solver based on a regular mesh at lower resolution 
implies a tremendous speed-up. This comes at the cost of a preprocessing step, 
which converts the initial discontinuous model into a smooth version. The hope 
is that libraries will be provided to do this. 
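The simplest special case of this idea is the long-wavelength averaging of a finely layered 1D medium in the spirit of Backus (1962). For waves travelling perpendicular to the layering, the effective elastic modulus is the harmonic mean of the layer moduli and the effective density is the arithmetic mean. The following Python sketch illustrates only this 1D limit. It is not the homogenization scheme of Capdeville and co-workers, which treats 2D and 3D models. The function name, the layer properties, and the window length are hypothetical.

```python
import numpy as np

def backus_1d(mu, rho, window):
    """Smooth a finely layered 1D model with layers of equal thickness.

    Effective modulus: moving harmonic mean of mu.
    Effective density: moving arithmetic mean of rho.
    """
    kernel = np.ones(window) / window
    mu_eff = 1.0 / np.convolve(1.0 / mu, kernel, mode="valid")
    rho_eff = np.convolve(rho, kernel, mode="valid")
    return mu_eff, rho_eff

# Periodic stack of two alternating layers (assumed values)
n = 200
even = np.arange(n) % 2 == 0
rho = np.where(even, 2000.0, 2600.0)   # density (kg/m^3)
vs = np.where(even, 1500.0, 3000.0)    # shear velocity (m/s)
mu = rho * vs ** 2                     # shear modulus (Pa)

mu_eff, rho_eff = backus_1d(mu, rho, window=20)
v_eff = np.sqrt(mu_eff / rho_eff)      # velocity of the smooth model
v_ray = 1.0 / np.mean(1.0 / vs)        # time-average (ray-theoretical) velocity
print(f"effective velocity {v_eff[0]:.0f} m/s, time-average velocity {v_ray:.0f} m/s")
```

The smooth model is slower than the time-average velocity of the layer stack. A long-wavelength wavefield therefore does not see the simple average of the fine-scale velocities, which is why the preprocessing step has to be done with care.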

In my view, it is not only that homogenization could revolutionize forward 
modelling, leading to a revival of simple regular grid methods—the underlying 
theory is also important for the seismic inverse problem. In addition, recent de- 
velopments (Al-Attar and Crawford, 2016) allow mapping Earth models with 
complex topography to a computational model with flat surface. In combination 
with homogenization, this may open a route to modelling schemes where time- 
consuming computational mesh generation is replaced by a clever preprocessing 
scheme. The future will tell. 


11.3 Meshing 


For many problems of seismic wave propagation (volcanoes, rupture on com- 
plex faults, regions with complex topography) there is no way out of generating 
a high-quality mesh either using hexahedra or tetrahedra. As indicated in the 
introductory chapters, hexahedral meshes are more efficient in terms of overall 
simulation time whereas meshing is much easier using tetrahedra. 

The problem for seismology is the fact that there are no standardized 
workflows for geometry creation and meshing, even though efforts in connec- 
tion with spectral-element solvers (e.g. Casarotti et al., 2008) have helped in this 
direction. My experience is that the necessity to generate meshes for interesting 
seismological problems has significantly slowed down projects and endangered 
final success. The reasons are that (1) seismologists are usually not trained in 
computational geometry, (2) there does not seem to be any one-and-only soft- 
ware that solves all problems, (3) getting into the software and creating meshes 
is very time-consuming, (4) even well-matured commercial solutions are some- 
times not well adapted to the requirements of seismology, and (5) automation of 
the meshing process is close to impossible. 

While the developments in the field of homogenization might release the 
pressure of meshing for some problems, we still need to come up with stable, 
standardized meshing solutions. In addition, community libraries should be set 
up that collect computational meshes (e.g. for volcanoes, sedimentary basins, 
and continents) for further use (potentially with automated remeshing to finer 
scale).

Fig. 11.2 Chequerboard test. Tomographic resolution test: chequerboard-like
velocity perturbations (red and blue colours) on top of a layered background
model under the Australian continent (true model at the top). For a given
source-receiver geometry using full waveform inversion the model at the bottom
is recovered. While this gives some qualitative estimation on how structures
are recovered it does not replace a full (but computationally very expensive)
quantitative uncertainty analysis. Figure from Fichtner et al. (2009a).

1 If you calculate a true solution with a certain forward solver and then solve
the inverse problem with the same solver, even if you add random noise, this is
called an inversion crime. This is standard practice for chequerboard tests.
Assessing uncertainties when fitting real data is far more difficult.


11.4 Nonlinear inversion, uncertainties 


In a recent international board meeting on supercomputing, a statement was 
made that the majority of projects seeking large computational resources are tar- 
geting problems related to uncertainties. This is not surprising, as (1) there is the 
expectation that Exaflops, not long after this volume is published, will open the 
way to increasing use of Monte Carlo approaches to inverse problems, and (2) in
many fields the problem of properly estimating uncertainties (of any kind) is still 
basically unsolved. 

Seismology is not an exception. Everyone knows that chequerboard tests (see 
Fig. 11.2) are not the right approach, as usually the infamous ‘inversion crime’ 
is committed.¹ Despite recent progress in connection with waveform inversion
(Fichtner and Trampert, 2011a; Fichtner and Trampert, 2011b; Trampert et al.,
2013; Zhang et al., 2013, to name but a few) I think it is fair to say that we 
are far from being able to appraise our final tomographic models in a proper 
quantitative way. 

This has been a dangerous situation for many decades, as other fields, such 
as tectonics, geology, and geodynamics, base their research on results coming 
from seismic tomography. The fact that many of the velocity-perturbation models 
converge should not keep us from seeking a more quantitative approach to uncer- 
tainties. The efficient forward solvers that now exist, combined with increasing 
computer power which allows Monte Carlo model space searches, should point 
us in the right direction. 
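What a Monte Carlo model space search delivers can be shown with a toy problem. The following Python sketch uses a random-walk Metropolis sampler to estimate an origin time and a slowness, together with their uncertainties, from noisy synthetic travel times along a straight line. The forward problem is a one-line formula standing in for an expensive 3D simulation. All names, parameter values, step sizes, and the flat prior are assumptions made for illustration.

```python
import numpy as np

def metropolis(log_post, m0, step, nsamp, seed=0):
    """Random-walk Metropolis sampler returning the chain of visited models."""
    rng = np.random.default_rng(seed)
    m = np.asarray(m0, dtype=float)
    lp = log_post(m)
    chain = np.empty((nsamp, m.size))
    for i in range(nsamp):
        proposal = m + step * rng.standard_normal(m.size)
        lp_prop = log_post(proposal)
        # Accept with probability min(1, posterior ratio)
        if np.log(rng.random()) < lp_prop - lp:
            m, lp = proposal, lp_prop
        chain[i] = m
    return chain

# Synthetic "observations": travel times t = t0 + s * x with Gaussian noise
rng = np.random.default_rng(42)
x = np.linspace(10.0, 100.0, 10)          # epicentral distances (km)
t0_true, s_true, sigma = 0.0, 0.25, 0.2   # origin time (s), slowness (s/km), noise (s)
t_obs = t0_true + s_true * x + sigma * rng.standard_normal(x.size)

def log_post(m):
    """Gaussian likelihood with a flat prior on the model m = (t0, s)."""
    residual = t_obs - (m[0] + m[1] * x)
    return -0.5 * np.sum(residual ** 2) / sigma ** 2

chain = metropolis(log_post, m0=[0.5, 0.3], step=np.array([0.05, 0.001]), nsamp=20000)
samples = chain[5000:]                    # discard burn-in
mean, std = samples.mean(axis=0), samples.std(axis=0)
print(f"slowness = {mean[1]:.4f} +/- {std[1]:.4f} s/km")
```

The sampler needs 20000 forward calculations for two parameters. This is why such approaches only become feasible for 3D problems with very efficient forward solvers and large computational resources.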


Appendix A 


Community Software and Platforms in Seismology


In this appendix, some information is provided on current open-source software
for 3D seismic simulation (and inversion), seismic processing, benchmarking, and
general services useful for seismological research. The list is certainly
incomplete. In addition, a brief description is given of the Jupyter Notebooks
with which the supplementary electronic material of this volume is provided.


A.1 Wave propagation and inversion 


Finite-difference method 


• SOFI3D (<https://git.scc.kit.edu/GPIAG-Software/SOFI3D>) stands for
  Seismic mOdeling with FInite differences, and is a 3D viscoelastic time
  domain massively parallel modelling code.

• SW4 (<http://www.geodynamics.org/cig/software/sw4>) implements
  substantial capabilities for 3D seismic modeling, with a free-surface condition
  on the top boundary, absorbing super-grid conditions on the far-field
  boundaries, and an arbitrary number of point force and/or point moment
  tensor source terms.

• FDSim (<http://www.nuquake.eu/fdsim>), Fortran95 computer codes for
  numerical simulations of seismic wave propagation and earthquake motion
  in structurally complex 3D heterogeneous viscoelastic media.

• SEISMIC_CPML (<https://geodynamics.org/cig/software/seismic_cpml>)
  is a set of eleven open-source Fortran90 programs to solve the
  two-dimensional or three-dimensional isotropic or anisotropic elastic,
  viscoelastic, or poroelastic wave equation using a finite-difference method
  with Convolutional or Auxiliary Perfectly Matched Layer (C-PML or ADE-PML)
  conditions.

• AWP-ODC (<http://hpgeoc.sdsc.edu/AWPODC>), anelastic wave propagation,
  independently simulates the dynamic rupture and wave propagation
  that occurs during an earthquake. Dynamic rupture produces friction,
  traction, slip, and slip rate information on the fault. The moment function
  is constructed from this fault data and used to initialize wave propagation.

• FDMAP (<https://pangea.stanford.edu/~edunham/codes/codes.html>),
  dynamic ruptures and seismic wave propagation in complex geometries.


Finite/Spectral-element method 


• SPECFEM3D-CARTESIAN (<http://www.geodynamics.org/cig/software/specfem3d>).
  SPECFEM3D CARTESIAN simulates acoustic (fluid),
  elastic (solid), coupled acoustic/elastic, poroelastic, or seismic wave
  propagation in any type of conforming mesh of hexahedra (structured or
  unstructured).

• SPECFEM3D GLOBE (<https://geodynamics.org/cig/software/specfem3d_globe/>)
  simulates global and regional (continental-scale) seismic wave
  propagation. Effects due to lateral variations in compressional-wave speed,
  shear-wave speed, density, a 3D crustal model, ellipticity, topography and
  bathymetry, the oceans, rotation, and self-gravitation are all included.

• SES3D (<http://www.cos.ethz.ch/software/ses3d.html>) is a program
  package for the simulation of elastic wave propagation and waveform inversion
  in a spherical section. The package is based on a spectral-element
  discretization of the seismic wave equation combined with adjoint techniques.

• AXISEM (<http://www.geodynamics.org/cig/software/axisem>) is a parallel
  spectral-element method for 3D (an-)elastic, anisotropic, and acoustic
  wave propagation in spherical domains. It requires axisymmetric background
  models and runs within a 2D computational domain, thereby
  reaching all desired highest observable frequencies (up to 2 Hz) in global
  seismology.

• REGSEM (<http://www.ipgp.fr/~paulcup/RegSEM.html>) is a versatile
  code based on the spectral-element method to compute seismic wave
  propagation at the regional scale.

• EFISPEC (<http://efispec.free.fr>) stands for Element-FInis SPECtraux.
  It solves the three-dimensional wave equations using a spectral-element
  method.

• SEM2DPACK (<http://web.gps.caltech.edu/~ampuero/software.html>)
  for dynamic rupture simulations on non-planar faults in heterogeneous or
  non-linear media with the spectral-element method.


Discontinuous Galerkin method


• SEISSOL (<http://www.seissol.org>) is a modal discontinuous Galerkin
  method primarily for wave and rupture propagation on tetrahedral meshes.

• SPEED (<http://www.speed.mox.polimi.it/SPEED>) is a discontinuous
  Galerkin spectral-element code that incorporates the open-source libraries
  METIS and MPI for the parallel computation (mesh partitioning and
  message passing). It has been designed with the aim of simulating
  large-scale seismic events, allowing the evaluation of the typically multi-scale
  wave-propagation problems in its complexity, from far-field to near-field
  and from near-field to soil-structure interaction effects.

• NEXD (<http://www.rub.de/nexd>) is a software package for high-order
  simulation of seismic waves using the nodal discontinuous Galerkin method.

Other useful software 


• MINEOS (<http://www.geodynamics.org/cig/software/mineos>) computes
  synthetic seismograms in a spherically symmetric non-rotating Earth by
  summing normal modes.

• GEMINI (www.geophysik.ruhr-uni-bochum.de/trac/gemini) is a program
  package to calculate Green’s functions and surface wave modes of the elastic
  wave equation for one-dimensional depth-dependent media. Applications
  of the code range from high-frequency, small-scale wave-propagation
  problems like ultrasonic waves, seam waves, and shallow seismics to
  continental-scale seismic waves from earthquakes.

• LASIF (<http://www.lasif.net>) (LArge-scale Seismic Inversion Framework)
  is a data-driven end-to-end workflow tool to perform adjoint full
  seismic waveform inversions.

• ASKI (<http://www.rub.de/aski>) is a highly modularized program suite
  for sensitivity analysis and iterative full waveform inversion using waveform
  sensitivity kernels for 1D or 3D heterogeneous elastic background media,
  and makes use of different forward modelling codes.

• QSEIS (<http://www.gfz-potsdam.de>) is a Fortran code for calculating
  synthetic seismograms based on a layered viscoelastic half-space Earth
  model.

• QSSP (<http://www.gfz-potsdam.de>) is a code for calculating complete
  synthetic seismograms of a spherical Earth using normal mode theory.


A.2 Data processing, visualization, services 


• SEISMO-LIVE (<http://www.seismo-live.org>) is an open-source library
  with Jupyter Notebooks for seismology running on a dedicated server.
  Notebooks can be run on any browser. This site contains the supplementary
  material of this volume (→ computational seismology).

• ObsPy (<http://www.obspy.org>). ObsPy is an open-source project
  dedicated to providing a Python framework for processing seismological data.
  It provides parsers for common file formats, client access to data centres,
  and seismological signal processing routines which allow the manipulation
  of seismological time series.

• ASDF (<http://www.seismic-data.org>). The Adaptable Seismic Data
  Format (ASDF) is a modern file format intended for researchers and
  analysts. It combines the capability to create comprehensive data sets including
  all necessary meta information with high-performance parallel I/O for the
  most demanding use cases.

• SAC (<http://www.ds.iris.edu/ds/nodes/dmc/software/downloads/sac>).
  The Seismic Analysis Code is a general purpose interactive program
  designed for the study of sequential signals, especially time series data.
  Emphasis has been placed on analysis tools used by research seismologists
  in the detailed study of seismic events.

• EPOS (<http://www.epos-eu.org>). The European Plate Observing System
  (EPOS) is a planned research infrastructure for European solid Earth
  science, integrating existing research infrastructures to enable innovative
  multidisciplinary research, recently prioritized by the European Strategy
  Forum on Research Infrastructures (ESFRI) for implementation.

• VERCE (<http://www.verce.eu>). VERCE developed a data-intensive
  e-science environment to enable innovative data analysis and data modelling
  methods that fully exploit the increasing wealth of open data generated
  by the observational and monitoring systems of the global seismology
  community.

• IRIS (<http://www.iris.edu>). The Incorporated Research Institutions for
  Seismology provide management of, and access to, observed and derived
  data for the global Earth science community.

• ORFEUS (<http://www.orfeus-eu.org>). Observatories and Research
  Facilities for European Seismology is the non-profit foundation that aims
  at coordinating and promoting digital, broadband (BB) seismology in the
  European-Mediterranean area.

• CMT (<http://www.globalcmt.org>) provides moment tensor estimates for
  globally observable earthquakes.

• INSTASEIS (<http://www.instaseis.net>). Instaseis calculates broadband
  seismograms from Green’s function databases generated with AxiSEM and
  allows for near instantaneous (on the order of milliseconds) extraction of
  seismograms.


A.3 Benchmarking


• SISMOWINE (<http://www.sismowine.org>) is an interactive seismological
  web interface used for numerical modelling benchmarking. Participants
  calculate solutions for the defined models using their numerical or analytical
  computational method and compare the solutions with those submitted by
  other participants.

• NUQUAKE (<http://www.nuquake.eu>) maintains a number of forward
  solutions in 1-3D and also provides access to analytical solutions.

• QUEST (<http://www.quest-itn.org/library/software>) contains some
  simple numerical simulation codes and several analytical solutions (e.g. Lamb’s
  problem).

• SCEC-USGS (<http://scecdata.usc.edu/cvws>) Spontaneous Rupture
  Code Verification Project. Comparison of various 3D methods to simulate
  (spontaneous) rupture dynamics.

• SIV (<http://equake-re.info/SIV>). The SIV project aims at quantifying the
  uncertainty in earthquake source inversion through a series of verification
  & validation experiments.

• SEAM (<http://www.seg.org/resources/research/seam>). The SEG Advanced
  Modelling Program (SEAM) is a partnership between industry and
  SEG designed to advance geophysical science and technology through the
  construction of subsurface models and generation of synthetic data sets.


A.4 Jupyter Notebooks 


The material presented in this volume is complemented by a substantial number
of Jupyter Notebooks (previously IPython Notebooks). The Jupyter Notebooks
are based on an interactive computational environment, in which you can
combine code execution, rich text, mathematics, plots, and rich media. The choice
of Python over other programming languages is primarily due to its independence
from commercial software for computer practicals. In addition, the Jupyter
Notebooks offer a fascinating new tool to exchange software or practicals in a
platform-independent way. The potential of this approach was recently recognized
in an article in Nature (Shen, 2014) providing access to an online example.
A snapshot is shown in Fig. A.1. All Jupyter Notebooks will be provided in a
novel openly accessible library of notebooks for seismology with online execution
options (www.seismo-live.org).


A.5 Supplementary material 


Details, instructions on how to install the Python environment, and access to 
the computer exercises are given on the website maintaining the electronic 
material: 


<http://www.computational-seismology.org> 


Fig. A.1 Jupyter Notebook. The notebooks can be run by any web browser.
Text, graphs, and equations can be inserted using the mark-up language and
LaTeX scripting. Code blocks can be run in dedicated windows, return data, and
graphs inside the notebook. [Screenshot of the notebook ‘Numerical Integration -
The Gauss-Lobatto-Legendre approach’.]
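The notebook shown in Fig. A.1 deals with Gauss-Lobatto-Legendre (GLL) integration and asks the reader to compare a numerical integral on the interval [-1, 1] with the analytical result. The following Python sketch shows one way such an exercise could look. It is not the notebook’s own routine: the function name `gll_points_weights` is hypothetical, and the collocation points are obtained from NumPy’s Legendre polynomial class as the roots of the derivative of the Legendre polynomial of order N, complemented by the end points.

```python
import numpy as np
from numpy.polynomial import Legendre

def gll_points_weights(N):
    """Return the N + 1 GLL collocation points and integration weights."""
    PN = Legendre.basis(N)                         # Legendre polynomial P_N
    interior = np.sort(np.real(PN.deriv().roots()))  # roots of P_N'
    x = np.concatenate(([-1.0], interior, [1.0]))
    w = 2.0 / (N * (N + 1) * PN(x) ** 2)
    return x, w

def f(x):
    """Test function with a known integral over [-1, 1]."""
    return x ** 6

x, w = gll_points_weights(4)
numerical = np.sum(w * f(x))
analytical = 2.0 / 7.0
print(f"numerical {numerical:.12f}, analytical {analytical:.12f}")
```

GLL integration with N + 1 points is exact for polynomials up to degree 2N - 1, so the two numbers agree to rounding error for this test function. Replacing `f` by a non-polynomial function shows how the error decreases with increasing N.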


Below is a non-exhaustive list of programs and tools available online. Many of 
these programs are also available in Matlab®. Complete codes are given as well as 
solutions to the computer exercises. You are advised to write as many codes from 
scratch as you can and compare them with the available solutions. This has by far 
the highest training value. The material includes: 


General 


• Introduction to Jupyter Notebooks
• Introduction to Python


Part I 


• Analytical solutions for the 1-3D acoustic wave equation in homogeneous
  media
• Analytical solutions for the double-couple point source in 3D
• Analytical solution to Lamb’s problem in 3D
• Examples for time reversal and reciprocity
• Numerical Green’s functions and convolution


Part II


• Taylor operators for finite-difference calculations
• Finite-difference codes for 1D and 2D (acoustic case)
• Finite-difference code with optimal operators for 1D
• Staggered-grid finite-difference codes for 1D (elastic case)
• Pseudospectral codes (Fourier and Chebyshev) for 1D acoustic and elastic
  wave propagation
• Finite-element code for static elasticity and 1D elastic wave equation
• Lagrange polynomials, interpolation, derivative
• Legendre polynomials
• Gauss-Lobatto-Legendre collocation points
• Gauss-Lobatto-Legendre integration
• Spectral-element code for elastic wave propagation in 1D
• Finite-volume code for scalar wave propagation and linear systems (elastic
  wave propagation) in 1D
• Discontinuous Galerkin code for scalar wave propagation and linear
  systems (elastic wave propagation) in 1D